
I am trying to use VNDetectFaceRectanglesRequest from the new Vision API to detect faces in images. Then I draw a red rectangle on each detected face.

But I'm having an issue converting the boundingBox from VNFaceObservation into a CGRect. It seems that my only problem is the y origin.


Here's my code:

let request = VNDetectFaceRectanglesRequest { request, error in
    var final_image = UIImage(ciImage: image)
    if let results = request.results as? [VNFaceObservation] {
        for face_obs in results {
            UIGraphicsBeginImageContextWithOptions(final_image.size, false, 1.0)
            final_image.draw(in: CGRect(x: 0, y: 0, width: final_image.size.width, height: final_image.size.height))

            var rect = face_obs.boundingBox
            // RESULT 2 is when I uncomment this line to "flip" the y
            //rect.origin.y = 1 - rect.origin.y
            let conv_rect = CGRect(x: rect.origin.x * final_image.size.width,
                                   y: rect.origin.y * final_image.size.height,
                                   width: rect.width * final_image.size.width,
                                   height: rect.height * final_image.size.height)

            let c = UIGraphicsGetCurrentContext()!
            c.setStrokeColor(UIColor.red.cgColor)
            c.setLineWidth(0.01 * final_image.size.width)
            c.stroke(conv_rect)

            let result = UIGraphicsGetImageFromCurrentImageContext()
            UIGraphicsEndImageContext()

            final_image = result!
        }
    }
    DispatchQueue.main.async {
        self.image_view.image = final_image
    }
}


let handler = VNImageRequestHandler(ciImage: image)
DispatchQueue.global(qos: .userInteractive).async {
    do {
        try handler.perform([request])
    } catch {
        print(error)
    }
}

Here are the results so far.

Result 1 (without flipping the y): [image]

Result 2 (flipping the y): [image]



Solution

I found a solution on my own for the rect. Vision's boundingBox is normalized with its origin at the image's lower-left corner, so the y coordinate has to be flipped and offset by the rect's height before drawing in UIKit's top-left-origin space.

let rect = face_obs.boundingBox
let x = rect.origin.x * final_image.size.width
let w = rect.width * final_image.size.width
let h = rect.height * final_image.size.height
// Flip the normalized, bottom-left-origin y into UIKit's top-left-origin space.
let y = final_image.size.height * (1 - rect.origin.y) - h
let conv_rect = CGRect(x: x, y: y, width: w, height: h)

However, I marked @wei-jay's answer as the accepted one since it's more elegant.

Marie Dm
4 Answers


There are built-in functions that will do this for you. To convert from normalized form, use this:

func VNImageRectForNormalizedRect(_ normalizedRect: CGRect, _ imageWidth: Int, _ imageHeight: Int) -> CGRect

And vice-versa:

func VNNormalizedRectForImageRect(_ imageRect: CGRect, _ imageWidth: Int, _ imageHeight: Int) -> CGRect

Similar methods for points:

func VNNormalizedFaceBoundingBoxPointForLandmarkPoint(_ faceLandmarkPoint: vector_float2, _ faceBoundingBox: CGRect, _ imageWidth: Int, _ imageHeight: Int) -> CGPoint
func VNImagePointForNormalizedPoint(_ normalizedPoint: CGPoint, _ imageWidth: Int, _ imageHeight: Int) -> CGPoint
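
As a usage sketch (the helper name imageRect(for:in:) and the imageSize parameter are my own, assuming imageSize is the size of the image given to the VNImageRequestHandler): VNImageRectForNormalizedRect only scales the normalized rect up to pixel coordinates, so the y axis still has to be flipped before drawing with UIKit's top-left origin.

import UIKit
import Vision

// Convert a VNFaceObservation's normalized boundingBox into UIKit drawing
// coordinates for an image of the given size.
func imageRect(for observation: VNFaceObservation, in imageSize: CGSize) -> CGRect {
    // Scale the normalized rect to pixel coordinates (origin still bottom-left).
    let scaled = VNImageRectForNormalizedRect(observation.boundingBox,
                                              Int(imageSize.width),
                                              Int(imageSize.height))
    // Flip vertically for UIKit's top-left origin.
    return CGRect(x: scaled.origin.x,
                  y: imageSize.height - scaled.origin.y - scaled.height,
                  width: scaled.width,
                  height: scaled.height)
}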
bitemybyte
  • This should be the designated answer, being Apple's official way of going back from relative sizes to the image sizes. One question though - do these conversions work also for Images with different orientations? (Portrait / Landscape) ? – Motti Shneor Aug 06 '19 at 16:00

You have to apply a translation and scale according to the image. Example:

func drawVisionRequestResults(_ results: [VNFaceObservation]) {
    print("face count = \(results.count)")
    previewView.removeMask()

    // Flips the y axis: Vision's origin is at the lower-left, UIKit's at the upper-left.
    let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -self.view.frame.height)

    // Scales the normalized coordinates up to the view's size.
    let translate = CGAffineTransform.identity.scaledBy(x: self.view.frame.width, y: self.view.frame.height)

    for face in results {
        // The coordinates are normalized to the dimensions of the processed image,
        // with the origin at the image's lower-left corner.
        let facebounds = face.boundingBox.applying(translate).applying(transform)
        previewView.drawLayer(in: facebounds)
    }
}
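
As a footnote, the scale and flip can be combined into a single transform; a minimal sketch, assuming the same view-sized target as above:

let viewSize = self.view.frame.size
// Scale the normalized box up to the view and flip the y axis in one step.
// CGRect.applying(_:) standardizes away the negative height the flip produces.
let transform = CGAffineTransform(scaleX: viewSize.width, y: -viewSize.height)
    .translatedBy(x: 0, y: -1)
let facebounds = face.boundingBox.applying(transform)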
Willjay
  • This answer didn't work for me. To make it work I had to change the transform to: let transform = CGAffineTransform(scaleX: -1, y: -1).translatedBy(x: -self.view.frame.width, y: -self.view.frame.height) – zumzum Oct 06 '17 at 22:10
  • @zumzum Your code works for me too. From my observation, the accepted answer doesn't work for the front camera, and in that case your code works perfectly; the accepted answer works fine when I use the back camera of my iPhone. – Dhaval Dobariya Jun 12 '18 at 12:48

I tried multiple ways and here's what worked best for me:

dispatch_async(dispatch_get_main_queue(), ^{
    VNDetectedObjectObservation *newObservation = request.results.firstObject;
    if (newObservation) {
        self.lastObservation = newObservation;
        CGRect transformedRect = newObservation.boundingBox;
        // previewLayer is an AVCaptureVideoPreviewLayer; this converts the
        // normalized rect into the layer's coordinate space.
        CGRect convertedRect = [self.previewLayer rectForMetadataOutputRectOfInterest:transformedRect];
        self.highlightView.frame = convertedRect;
    }
});
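
Note that rectForMetadataOutputRectOfInterest: is defined on AVCaptureVideoPreviewLayer, so this approach fits the live-camera-preview case and accounts for the layer's videoGravity for you. The Swift rename of the same method, layerRectConverted(fromMetadataOutputRect:), appears in the comment below.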
Vibhor Goyal
  • For those looking for a Swift equivalent, I have found something very similar in this tutorial: https://github.com/jeffreybergier/Blog-Getting-Started-with-Vision. I have removed the line that flips the origin and in my case it works: `guard let newObservation = request.results?.first as? VNFaceObservation else { return }` `let transformedRect = newObservation.boundingBox;` `let convertedRect = self.cameraLayer.layerRectConverted(fromMetadataOutputRect: transformedRect);` `self.highlightView?.frame = convertedRect` – Giorgio Tempesta Jul 30 '18 at 21:44
// This swaps the x/y axes and width/height, presumably because the source
// image is rotated 90° relative to the view (e.g. a portrait camera buffer).
var rect = CGRect()
rect.size.height = viewSize.height * boundingBox.width
rect.size.width = viewSize.width * boundingBox.height
rect.origin.x = boundingBox.origin.y * viewSize.width
rect.origin.y = boundingBox.origin.x * viewSize.height
App Dev Guy
Lee Irvine