Questions tagged [apple-vision]

Apple Vision is a high-level computer vision framework for identifying faces, detecting and tracking features, and classifying images, video, tabular data, audio, and motion sensor data.

The Apple Vision framework performs face and face-landmark detection on input images and video, as well as barcode recognition, image registration, text detection, and feature tracking. The Vision API also allows the use of custom Core ML models for tasks like classification or object detection.

205 questions
87
votes
2 answers

iOS revert camera projection

I'm trying to estimate my device's position relative to a QR code in space. I'm using ARKit and the Vision framework, both introduced in iOS 11, but the answer to this question probably doesn't depend on them. With the Vision framework, I'm able to get…
Guig
  • 9,891
  • 7
  • 64
  • 126
58
votes
8 answers

Converting a Vision VNTextObservation to a String

I'm looking through Apple's Vision API documentation and I see a couple of classes that relate to text detection in UIImages: 1) class VNDetectTextRectanglesRequest, 2) class VNTextObservation. It looks like they can detect characters, but I don't…
Adrian
  • 16,233
  • 18
  • 112
  • 180
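The gap described above — rectangles but no strings — can be closed with `VNRecognizeTextRequest` (iOS 13+), whose results carry candidate strings rather than just bounding boxes. A minimal sketch, assuming a `UIImage` input; the function name is illustrative:

```swift
import Vision
import UIKit

// A minimal sketch (iOS 13+): VNTextObservation only provides rectangles.
// VNRecognizeTextRequest returns VNRecognizedTextObservation results,
// which expose ranked candidate strings.
func recognizeText(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNRecognizeTextRequest { request, error in
        guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
        for observation in observations {
            // Each observation offers ranked candidates; take the best one.
            if let candidate = observation.topCandidates(1).first {
                print(candidate.string)
            }
        }
    }
    request.recognitionLevel = .accurate

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```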
24
votes
3 answers

Apple Vision framework – Text extraction from image

I am using the Vision framework on iOS 11 to detect text in an image. The text is detected successfully, but how can we get the detected text?
iOS
  • 5,450
  • 5
  • 23
  • 25
16
votes
2 answers

Using iPhone TrueDepth sensor to detect a real face vs photo?

How can I use the depth data captured by the iPhone TrueDepth camera to distinguish between a real human 3D face and a photograph of one? The requirement is to use it for authentication. What I did: Created a sample app to get a continuous…
abhimuralidharan
  • 5,752
  • 5
  • 46
  • 70
13
votes
3 answers

Convert VNRectangleObservation points to other coordinate system

I need to convert the CGPoints received from a VNRectangleObservation (bottomLeft, bottomRight, topLeft, topRight) to another coordinate system (e.g. a view's coordinates on screen). I define a request: // Rectangle Request let…
mihaicris
  • 337
  • 3
  • 10
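The conversion asked about above usually comes down to two steps: scale the normalized Vision coordinates up to the target size, and flip the y-axis, since Vision's origin is bottom-left while UIKit's is top-left. A minimal sketch, assuming `viewSize` already matches the displayed image's frame (aspect-fit math is omitted):

```swift
import Vision
import UIKit

// A minimal sketch: Vision returns normalized (0...1) coordinates with a
// bottom-left origin; UIKit views use a top-left origin.
func convert(_ observation: VNRectangleObservation, to viewSize: CGSize) -> [CGPoint] {
    let points = [observation.topLeft, observation.topRight,
                  observation.bottomRight, observation.bottomLeft]
    return points.map { point in
        CGPoint(x: point.x * viewSize.width,
                y: (1 - point.y) * viewSize.height)  // flip the y-axis
    }
}
```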
11
votes
3 answers

Classify faces from VNFaceObservation

I'm working with the Vision framework to detect faces and objects in multiple images, and it works fantastically. But I have a question that I can't answer from the documentation. The Photos app on iOS classifies faces, and you can click on a face to show all the images…
mhergon
  • 1,688
  • 1
  • 18
  • 39
10
votes
0 answers

How to convert BoundingBox from VNRequest to CVPixelBuffer Coordinate

I'm trying to crop a CVImageBuffer (from AVCaptureOutput) using the boundingBox of a face detected by Vision (VNRequest). When I draw over the AVCaptureVideoPreviewLayer using: let origin = previewLayer.layerPointConverted(fromCaptureDevicePoint:…
Alak
  • 1,329
  • 3
  • 11
  • 18
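For the pixel-buffer case above, Vision ships a helper, `VNImageRectForNormalizedRect`, that scales a normalized `boundingBox` to pixel dimensions; only the y-flip is left to do by hand. A minimal sketch, assuming the buffer's orientation already matches the request's:

```swift
import Vision
import CoreVideo

// A minimal sketch: convert a VNRequest's normalized boundingBox into the
// pixel coordinates of a CVPixelBuffer. The y-axis is flipped because
// Vision's origin is bottom-left while image buffers are top-left.
func pixelRect(for boundingBox: CGRect, in pixelBuffer: CVPixelBuffer) -> CGRect {
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    var rect = VNImageRectForNormalizedRect(boundingBox, width, height)
    rect.origin.y = CGFloat(height) - rect.origin.y - rect.height  // flip y
    return rect
}
```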
10
votes
3 answers

Apple Vision image recognition

Like many other developers, I have plunged into Apple's new ARKit technology. It's great. For a specific project, however, I would like to be able to recognise (real-life) images in the scene, to either project something on it (just like…
10
votes
3 answers

ARKit and Vision frameworks for Object Recognition

I would really like some guidance on combining Apple's new Vision API with ARKit in a way that enables object recognition. This would not need to track a moving object, just recognize it as stable in 3D space for the AR experience to react…
cnzac
  • 435
  • 3
  • 13
9
votes
1 answer

ARKit and RealityKit - ARSessionDelegate is retaining 14 ARFrames

I am classifying images per frame from the ARSession delegate using the Vision framework and Core ML in an augmented reality app built with ARKit and RealityKit. While processing a frame.capturedImage I am not requesting another frame.capturedImage for…
Tanvirgeek
  • 540
  • 1
  • 9
  • 17
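The retained-ARFrames symptom above is commonly caused by queuing Vision work that captures each frame: ARKit recycles a small, fixed pool of frame buffers, so any frame held by pending work starves the pool. One common mitigation is to drop frames while a request is in flight. A simplified sketch (the `isProcessing` flag is not made thread-safe here; class and queue names are illustrative):

```swift
import ARKit
import Vision

// A minimal sketch: skip incoming frames while one is still being
// classified, so capturedImage buffers are released promptly instead of
// piling up in a dispatch queue.
final class FrameClassifier: NSObject, ARSessionDelegate {
    private var isProcessing = false
    private let visionQueue = DispatchQueue(label: "vision.queue")

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        guard !isProcessing else { return }  // drop the frame, don't queue it
        isProcessing = true
        // Capture only the pixel buffer we need, not the ARFrame itself.
        let pixelBuffer = frame.capturedImage
        visionQueue.async { [weak self] in
            defer { self?.isProcessing = false }
            let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
            try? handler.perform([/* your VNCoreMLRequest here */])
        }
    }
}
```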
8
votes
3 answers

Which languages are available for text recognition in Vision framework?

I'm trying to add an option to my app to allow for different languages when using Apple's Vision framework for recognising text. There seems to be a function for programmatically returning the supported languages, but I'm not sure if I'm calling it…
mralexhay
  • 1,164
  • 1
  • 10
  • 16
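The function alluded to above can be queried directly on the request. A minimal sketch, assuming iOS 15+, where `supportedRecognitionLanguages()` is an instance method (earlier systems expose a class method taking a recognition level and revision):

```swift
import Vision

// A minimal sketch (iOS 15+): ask the request which languages its current
// revision and recognition level support.
func printSupportedLanguages() {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    if let languages = try? request.supportedRecognitionLanguages() {
        print(languages)  // language tags such as "en-US"
    }
}
```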
8
votes
1 answer

ARSCNView renders its content at 120 fps (but I need 30 fps)

I'm developing an ARKit app along with the Vision/AVKit frameworks. I'm using an MLModel for classification of my hand gestures. My app recognizes Victory, Okey and ¡No pasarán! hand gestures for controlling a video. The app works fine, but the view's content is…
Andy Jazz
  • 49,178
  • 17
  • 136
  • 220
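For the frame-rate question above, `ARSCNView` inherits `preferredFramesPerSecond` from `SCNView`, which caps the renderer's frame rate. A minimal sketch:

```swift
import ARKit

// A minimal sketch: ARSCNView inherits SCNView's preferredFramesPerSecond,
// which caps how often the renderer draws.
func capFrameRate(of sceneView: ARSCNView) {
    sceneView.preferredFramesPerSecond = 30
}
```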
8
votes
1 answer

Apple Vision – Can't recognize a single number as region

I want to use VNDetectTextRectanglesRequest from the Vision framework to detect regions in an image containing only one character, the number '9', on a white background. I'm using the following code to do this: private func performTextDetection() { …
AndrzejZ
  • 245
  • 2
  • 10
8
votes
2 answers

VNTrackRectangleRequest internal error

I'm trying to get a simple rectangle-tracking controller going, and I can get rectangle detection working just fine, but the tracking request always ends up failing for a reason I can't quite find. Sometimes the tracking request will fire its…
Andy Heard
  • 1,715
  • 1
  • 15
  • 25
8
votes
3 answers

Vision Framework Barcode detection for iOS 11

I've been implementing a test of the new Vision framework which Apple introduced at WWDC 2017. I am specifically looking at barcode detection: after scanning an image from the Camera/Gallery, I've been able to determine whether or not it's a barcode image.…
Hitesh Arora
  • 81
  • 1
  • 4
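Beyond detecting that a barcode is present, as in the question above, each `VNBarcodeObservation` exposes the decoded payload and symbology. A minimal sketch, assuming a `UIImage` input; the function name is illustrative:

```swift
import Vision
import UIKit

// A minimal sketch: VNDetectBarcodesRequest returns VNBarcodeObservation
// results carrying the decoded string and the barcode's symbology.
func detectBarcodes(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNDetectBarcodesRequest { request, _ in
        guard let observations = request.results as? [VNBarcodeObservation] else { return }
        for barcode in observations {
            print(barcode.symbology.rawValue,
                  barcode.payloadStringValue ?? "<no payload>")
        }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```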