
I am trying to crop a CVImageBuffer (from AVCaptureOutput) using the boundingBox of a face detected by Vision (VNRequest). When I draw over the AVCaptureVideoPreviewLayer using:

let origin = previewLayer.layerPointConverted(fromCaptureDevicePoint: rect.origin)
let size = previewLayer.layerPointConverted(fromCaptureDevicePoint: rect.size.cgPoint)

to convert the boundingBox rect to the previewLayer's coordinate space, everything works as expected. But when I try to apply it to the CVImageBuffer (which has a different size), the result is wrong.
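One detail that may matter here (a sketch, with a helper name of my own, not from the code below): Vision's normalized boundingBox uses a bottom-left origin, while a CVPixelBuffer's rows are addressed from the top-left, so the rect presumably needs a Y flip before it is used as a byte offset into the buffer:

```swift
import CoreGraphics

// Hypothetical helper: scale a Vision-normalized rect (bottom-left origin)
// into top-left-origin pixel coordinates for direct buffer indexing.
func pixelRect(forNormalized boundingBox: CGRect, width: CGFloat, height: CGFloat) -> CGRect {
    CGRect(x: boundingBox.origin.x * width,
           // Flip Y: Vision measures from the bottom edge, buffer rows start at the top.
           y: (1 - boundingBox.origin.y - boundingBox.height) * height,
           width: boundingBox.width * width,
           height: boundingBox.height * height)
}
```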


Below you can see how I convert the coordinates of the returned bounding box and how I crop the CVPixelBuffer:

class ViewController: UIViewController {
  var sequenceHandler = VNSequenceRequestHandler()

  func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
     // 1
     guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

     // 2
     let detectFaceRequest = VNDetectFaceLandmarksRequest(completionHandler: { [weak self]  (request, error) in
       guard error == nil else { return }
       let croppedImage = self?.crop(request: request, imageBuffer: imageBuffer) // Cropped buffer
     })

     // 3
     do {
       try sequenceHandler.perform(
         [detectFaceRequest],
         on: imageBuffer,
         orientation: .leftMirrored)
     } catch {
       print(error.localizedDescription)
     }
   }

  func crop(request: VNRequest, imageBuffer: CVImageBuffer) -> CVImageBuffer? {

     guard let results = request.results as? [VNFaceObservation], let result = results.first else { return nil }

     let width = CGFloat(CVPixelBufferGetWidth(imageBuffer)) // 1080
     let height = CGFloat(CVPixelBufferGetHeight(imageBuffer)) // 1920

     let cropRect = VNImageRectForNormalizedRect(result.boundingBox, Int(width), Int(height)) // Convert the normalized bounding box to image coordinates

     CVPixelBufferLockBaseAddress(imageBuffer, .readOnly)
     defer { CVPixelBufferUnlockBaseAddress(imageBuffer, .readOnly) } // Balance the lock on every return path
     guard let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer) else { return nil }
     let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer)

     let bytesPerPixel = 4
     let startAddress = baseAddress + Int(cropRect.origin.y) * bytesPerRow + Int(cropRect.origin.x) * bytesPerPixel

     guard let context = CGContext(data: startAddress, width: Int(cropRect.size.width), height: Int(cropRect.size.height), bitsPerComponent: 8, bytesPerRow: bytesPerRow, space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue | CGBitmapInfo.byteOrder32Little.rawValue), let data = context.data else {
       return nil
     }

     var pixelBuffer: CVPixelBuffer?
     let createBufferResult = CVPixelBufferCreateWithBytes(kCFAllocatorDefault, Int(cropRect.size.width), Int(cropRect.size.height), kCVPixelFormatType_32BGRA, data, bytesPerRow, nil, nil, nil, &pixelBuffer)

     guard createBufferResult == kCVReturnSuccess else { // kCVReturnSuccess is 0; bail out only on failure
       print(createBufferResult)
       return nil
     }

     return pixelBuffer
   }
}
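For comparison, a higher-level way to crop (a sketch, not the approach used above) is to let Core Image do it. CIImage also uses a bottom-left origin, so a rect produced by VNImageRectForNormalizedRect can be applied directly, with no manual pointer arithmetic or Y flip:

```swift
import CoreImage

// Illustrative alternative: crop via Core Image instead of raw buffer access.
// The function name is mine; `rect` is expected in image-space (bottom-left origin)
// coordinates, e.g. the output of VNImageRectForNormalizedRect.
func crop(pixelBuffer: CVPixelBuffer, to rect: CGRect) -> CIImage {
    CIImage(cvPixelBuffer: pixelBuffer).cropped(to: rect)
}
```

Rendering the resulting CIImage back into a CVPixelBuffer would still require a CIContext (e.g. `CIContext.render(_:to:)`), but the crop itself stays lazy and cheap.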
  • I have the same issue, but I'm trying to convert a specific point from Vision coordinates into the CVPixelBuffer of depth data. `VNImagePointForNormalizedPoint` is not working as expected, and I believe part of the problem is that `sequenceHandler.perform` returns the points in leftMirrored orientation. Unfortunately, simply removing the orientation did not work. – Izabella Melo Jan 21 '22 at 20:25

0 Answers