
I am using the Vision framework in iOS 11 to detect text in an image.

The text regions are detected successfully, but how can we get the detected text itself?

  • You now need to use CoreML and send that region to be read – Alex Trott Jun 17 '17 at 14:23
  • 2
    @Alex Already getting the regions being detected. Need solution to read that detected region. – iOS Jun 18 '17 at 15:20
  • 3
    Possible duplicate of [Converting a Vision VNTextObservation to a String](https://stackoverflow.com/questions/44533148/converting-a-vision-vntextobservation-to-a-string) – Artem Novichkov Jun 20 '17 at 16:45
  • @Abhishek can you look into this https://stackoverflow.com/questions/47864505/text-detection-in-images and guide me? – Pooja M. Bohora Dec 19 '17 at 04:47
  • @PoojaM.Bohora We can use the following combination to extract the text; the solution works on iOS 11+: the Vision framework + an ML model + the open-source Tesseract OCR. Execution steps: 1) Vision + ML detects the text regions. 2) Convert each text region to a CGRect and crop the text image from that region (see the cropping sketch after these comments). 3) Pass the strips of text image, instead of the full image, to Tesseract OCR to get the text. – iOS Dec 19 '17 at 07:21
  • @Abhishek OK, and for Tesseract, do we need to do any image processing before recognizing text? – Pooja M. Bohora Dec 20 '17 at 08:50
  • @PoojaM.Bohora No, we don't need any image processing. – iOS Dec 21 '17 at 07:11
  • Following your suggestion I am using `pod 'TesseractOCRiOS', '4.0.0'`, but it is still not giving precise results. Any suggestions? – Pooja M. Bohora Dec 26 '17 at 05:43
  • @ios_ddev any feedback on this? – Pooja M. Bohora Jan 01 '18 at 07:06
  • 2
    @PoojaM.Bohora As per my experience, the result for my extraction was 70-80% accurate. There are many aspects like "Text Font Size (Does not work if Font Size is small), Tesseract configuration setting (Need to configure the Tesseract engine as per the requirement) " that affect the text extraction. Use the "Black&White" mode of Tesseract while doing the extraction of the text. Also consider the image size, bigger is better. – iOS Jan 02 '18 at 07:26
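A minimal sketch of step 2 from the comments above, assuming a detected `VNTextObservation` called `observation` and a source `UIImage` called `image` (both names are placeholders): convert the normalized bounding box to pixel coordinates and crop out the strip that would be handed to Tesseract.

import UIKit
import Vision

// Hypothetical helper: crop the detected text region out of the image.
// Vision bounding boxes are normalized (0...1) with a bottom-left origin,
// while CGImage cropping expects pixels with a top-left origin.
func cropTextRegion(from image: UIImage, observation: VNTextObservation) -> UIImage? {
    guard let cgImage = image.cgImage else { return nil }
    let width = CGFloat(cgImage.width)
    let height = CGFloat(cgImage.height)
    let box = observation.boundingBox
    let rect = CGRect(x: box.minX * width,
                      y: (1 - box.maxY) * height,   // flip the y-axis
                      width: box.width * width,
                      height: box.height * height)
    guard let strip = cgImage.cropping(to: rect) else { return nil }
    return UIImage(cgImage: strip)
}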

3 Answers


Recognizing text in an image

VNRecognizeTextRequest is available starting in iOS 13.0 and macOS 10.15.

In Apple Vision you can easily extract text from an image using the VNRecognizeTextRequest class, which lets you create an image analysis request that finds and recognizes text in an image.

Here's a SwiftUI solution showing you how to do it (tested in Xcode 13.4, iOS 15.5):

import SwiftUI
import Vision

struct ContentView: View {
        
    var body: some View {
        ZStack {
            Color.black.ignoresSafeArea()
            Image("imageText").scaleEffect(0.5)
            SomeText()
        }
    }
}

The logic is the following:

struct SomeText: UIViewRepresentable {
    let label = UITextView(frame: .zero)
    
    func makeUIView(context: Context) -> UITextView {
        label.backgroundColor = .clear
        label.textColor = .systemYellow
        label.textAlignment = .center
        label.font = .boldSystemFont(ofSize: 25)
        return label
    }
    func updateUIView(_ uiView: UITextView, context: Context) {
        guard let path = Bundle.main.path(forResource: "imageText", ofType: "png")
        else { return }
        let url = URL(fileURLWithPath: path)
        let requestHandler = VNImageRequestHandler(url: url, options: [:])

        let request = VNRecognizeTextRequest { (request, _) in
            guard let obs = request.results as? [VNRecognizedTextObservation]
            else { return }

            for observation in obs {
                let topCan: [VNRecognizedText] = observation.topCandidates(1)

                if let recognizedText: VNRecognizedText = topCan.first {
                    label.text = recognizedText.string
                }
            }
        }
        // Choose one recognition level: .accurate is slower but precise;
        // .fast is nearly realtime but less accurate.
        request.recognitionLevel = .accurate
        // request.recognitionLevel = .fast
        try? requestHandler.perform([request])
    }
}
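Note that perform(_:) runs synchronously, and the .accurate level can take a while on large images, so in a production app you would typically run it off the main thread. A minimal sketch, assuming the same `requestHandler` and `request` as above (any UI updates inside the completion handler would then need to hop back to the main queue):

DispatchQueue.global(qos: .userInitiated).async {
    // perform(_:) blocks this background thread until recognition finishes
    try? requestHandler.perform([request])
}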

If you want to know the list of supported languages for recognition, please read this post.
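On iOS 15+ you can also query that list at runtime; a minimal sketch:

import Vision

let request = VNRecognizeTextRequest()
request.recognitionLevel = .accurate
// Returns language identifiers such as ["en-US", "fr-FR", ...] for the
// request's current recognition level and revision (iOS 15+ / macOS 12+).
if let languages = try? request.supportedRecognitionLanguages() {
    print(languages)
}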


Andy Jazz

Not exactly a dupe but similar to: Converting a Vision VNTextObservation to a String

You need to use either CoreML or another library to perform the OCR (SwiftOCR, etc.).

nathan

This will return an overlay image with rectangle boxes drawn on the detected text.

Here is the full Xcode project: https://github.com/cyruslok/iOS11-Vision-Framework-Demo

Hope it is helpful

// Text detection: returns an overlay image with rectangles drawn
// around the detected text regions and character boxes.
// (displayImageView is kept from the original signature but is unused here.)
func textDetect(detectImage: UIImage, displayImageView: UIImageView) -> UIImage {
    guard let cgImage = detectImage.cgImage else { return UIImage() }
    let handler = VNImageRequestHandler(cgImage: cgImage)
    var resultImage = UIImage()

    let request = VNDetectTextRectanglesRequest { request, error in
        if error != nil {
            print("Got an error while running the text detection request")
        } else if let results = request.results as? [VNTextObservation] {
            resultImage = self.drawRectanglesForTextDetection(image: detectImage,
                                                              results: results)
        }
    }
    request.reportCharacterBoxes = true
    do {
        // perform(_:) runs synchronously, so the completion handler
        // has already fired by the time we return.
        try handler.perform([request])
        return resultImage
    } catch {
        return resultImage
    }
}

func drawRectanglesForTextDetection(image: UIImage, results: [VNTextObservation]) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: image.size)
    // Vision bounding boxes are normalized (0...1) with a bottom-left
    // origin; this transform maps them into the image's coordinate space.
    var t = CGAffineTransform.identity
    t = t.scaledBy(x: image.size.width, y: -image.size.height)
    t = t.translatedBy(x: 0, y: -1)

    return renderer.image { ctx in
        for observation in results {
            // Green rectangle around each detected text region.
            ctx.cgContext.setFillColor(UIColor.clear.cgColor)
            ctx.cgContext.setStrokeColor(UIColor.green.cgColor)
            ctx.cgContext.setLineWidth(1)
            ctx.cgContext.addRect(observation.boundingBox.applying(t))
            ctx.cgContext.drawPath(using: .fillStroke)

            // Red rectangle around each individual character box.
            for characterBox in observation.characterBoxes ?? [] {
                ctx.cgContext.setFillColor(UIColor.clear.cgColor)
                ctx.cgContext.setStrokeColor(UIColor.red.cgColor)
                ctx.cgContext.setLineWidth(1)
                ctx.cgContext.addRect(characterBox.boundingBox.applying(t))
                ctx.cgContext.drawPath(using: .fillStroke)
            }
        }
    }
}
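A hypothetical call site, assuming a view controller with a `photo` image and an `imageView` outlet (both names are placeholders). The renderer produces a transparent background, so the boxes are layered on top of the photo:

// Hypothetical usage: `photo` and `imageView` are assumed to exist.
let overlay = textDetect(detectImage: photo, displayImageView: imageView)
imageView.image = photo
let overlayView = UIImageView(image: overlay)
overlayView.frame = imageView.bounds
overlayView.contentMode = imageView.contentMode
imageView.addSubview(overlayView)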
C4L