22

In my application, I used VNImageRequestHandler with a custom MLModel for object detection.

The app works fine on iOS versions before 14.5, but updating to iOS 14.5 broke it in two ways:

  1. Whenever `try handler.perform([visionRequest])` throws an error (`Error Domain=com.apple.vis Code=11 "encountered unknown exception" UserInfo={NSLocalizedDescription=encountered unknown exception}`), the pixelBuffer's memory is retained and never released. This fills up the AVCaptureOutput buffer pool, so no new frames are delivered.
  2. I changed the code as shown below: by copying the pixelBuffer into a new variable I fixed the stalled frames, but the memory leak still happens.

Because of the memory leak, the app eventually crashes.

Note that before iOS 14.5, detection worked perfectly and `try handler.perform([visionRequest])` never threw any error.

Here is my code:

private func predictWithPixelBuffer(sampleBuffer: CMSampleBuffer) {
  guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
    return
  }

  // Get additional info from the camera.
  var options: [VNImageOption: Any] = [:]
  if let cameraIntrinsicMatrix = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil) {
    options[.cameraIntrinsics] = cameraIntrinsicMatrix
  }

  autoreleasepool {
    // Workaround for the iOS 14.5 bug: when the Vision request fails, the pixel
    // buffer is never released, so the AVCaptureOutput buffer pool fills up and
    // no new frames are delivered. Copying the pixel buffer keeps frames coming,
    // but memory usage still grows a lot. A better fix is still needed.
    var clonePixelBuffer: CVPixelBuffer? = pixelBuffer.copy()
    let handler = VNImageRequestHandler(cvPixelBuffer: clonePixelBuffer!, orientation: orientation, options: options)
    print("[DEBUG] detecting...")

    do {
      try handler.perform([visionRequest])
    } catch {
      delegate?.detector(didOutputBoundingBox: [])
      failedCount += 1
      print("[DEBUG] detect failed \(failedCount)")
      print("Failed to perform Vision request: \(error)")
    }
    clonePixelBuffer = nil
  }
}
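
Note: `copy()` above is not a stock Core Video API; it is a custom deep-copy extension on CVPixelBuffer. A minimal sketch of such an extension (not necessarily the exact one used here, and assuming only a byte-for-byte clone of the pixel data is needed) could look like this:

    import Foundation
    import CoreVideo

    // Sketch of a deep-copy helper backing `pixelBuffer.copy()` above.
    extension CVPixelBuffer {
        func copy() -> CVPixelBuffer? {
            var copyOut: CVPixelBuffer?
            let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                             CVPixelBufferGetWidth(self),
                                             CVPixelBufferGetHeight(self),
                                             CVPixelBufferGetPixelFormatType(self),
                                             nil,
                                             &copyOut)
            guard status == kCVReturnSuccess, let copy = copyOut else { return nil }

            CVPixelBufferLockBaseAddress(self, .readOnly)
            CVPixelBufferLockBaseAddress(copy, [])
            defer {
                CVPixelBufferUnlockBaseAddress(copy, [])
                CVPixelBufferUnlockBaseAddress(self, .readOnly)
            }

            if CVPixelBufferIsPlanar(self) {
                // Copy each plane row by row so differing row strides are handled.
                for plane in 0..<CVPixelBufferGetPlaneCount(self) {
                    guard let src = CVPixelBufferGetBaseAddressOfPlane(self, plane),
                          let dst = CVPixelBufferGetBaseAddressOfPlane(copy, plane) else { continue }
                    let srcStride = CVPixelBufferGetBytesPerRowOfPlane(self, plane)
                    let dstStride = CVPixelBufferGetBytesPerRowOfPlane(copy, plane)
                    for row in 0..<CVPixelBufferGetHeightOfPlane(self, plane) {
                        memcpy(dst + row * dstStride, src + row * srcStride, min(srcStride, dstStride))
                    }
                }
            } else {
                guard let src = CVPixelBufferGetBaseAddress(self),
                      let dst = CVPixelBufferGetBaseAddress(copy) else { return nil }
                let srcStride = CVPixelBufferGetBytesPerRow(self)
                let dstStride = CVPixelBufferGetBytesPerRow(copy)
                for row in 0..<CVPixelBufferGetHeight(self) {
                    memcpy(dst + row * dstStride, src + row * srcStride, min(srcStride, dstStride))
                }
            }
            return copy
        }
    }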

Has anyone experienced the same problem? If so, how did you fix it?

Tuan Do
  • Do you also get this error with the Apple sample code projects for Core ML? – Matthijs Hollemans May 03 '21 at 09:11
  • @MatthijsHollemans This issue only happens when the `perform` function throws an exception; otherwise it doesn't happen. I've tried the Apple sample code, but since the function doesn't throw any exception there, the issue doesn't occur. Do you have any idea in which cases the function throws an exception? Could it be something related to my custom model? – Tuan Do May 03 '21 at 23:32
  • I don't know. If you try the exact same code with another model (from Apple's download site) and the error does not occur there, then I'd say it might be something in your model. Try removing layers from the model until the error goes away, to isolate where the issue is. – Matthijs Hollemans May 04 '21 at 09:25
  • I have this issue too; it broke all my custom models :( The issue was still present after upgrading to iOS 14.5.1. – Jon Vogel May 05 '21 at 19:15
  • Filed an issue on the CoreML tools repo. https://github.com/apple/coremltools/issues – Jon Vogel May 05 '21 at 20:25
  • @MatthijsHollemans: I've tried your suggestion using the https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture code. I used several models and noticed that Apple's object detector model never throws an exception, and YOLO doesn't throw any exception either, but my custom model does. I will try to remove layers :(. So sad because it worked before. – Tuan Do May 05 '21 at 23:13
  • @JonVogel: Yeah, I tried with 14.5.1 and 14.6 beta 2, same result :( – Tuan Do May 05 '21 at 23:14
  • `autoreleasepool` can't release the buffer data. I have the same issue while recording the screen, and I noticed it also occurs when using a custom gallery, in case anyone else has seen this. –  May 06 '21 at 04:35
  • The CoreML tools team closed my issue out. Apparently not the correct place to ask about this issue. – Jon Vogel May 06 '21 at 05:45
  • I've heard from other people who ran into the same issue. For reasons, I am not able to try this out myself right now but it does seem to affect other people besides you too. Filing a bug report with Apple, or using one of your incident reports, will help to put this issue on their radar quicker. – Matthijs Hollemans May 06 '21 at 09:21
  • Here is my feedback to Apple. FB9098585. I included an Xcode project reproducing the issue with the YOLOv3 full precision model available on their website. – Jon Vogel May 06 '21 at 20:27
  • @MatthijsHollemans: Thanks for your suggestion, I've already filed a bug report. Hope they will check and fix it – Tuan Do May 06 '21 at 23:07
  • @JonVogel That's great. Hopefully they will fix it soon – Tuan Do May 06 '21 at 23:09
  • 1
    Apple replied to my bug report and asked me for more information. I've attached source code + your github bug report for them to check. Hopefully they will fix it soon. – Tuan Do May 08 '21 at 03:15
  • 3
    I have 3 custom object detection models, and the conversion code follows @MatthijsHollemans' template from the MachineThink blog. Two models recognize 791 classes and are SSDLite with MobileNet v2 and v3L as feature extractors; they both have the issue above. However, I have an SSDLite model that recognizes a single class, based on SSDLite+MobileNetv3L, and it works just fine. If I remove the NMS model from the pipeline, all 3 versions work without an issue. The issue seems to be multiclass non-max suppression. – James May 10 '21 at 15:49
  • iOS 14.6 beta 3, released today, did NOT fix the issue. – Jon Vogel May 10 '21 at 19:01
  • 3
    @James Thanks for that info. Note that it's possible to replace the NMS model from the pipeline with a neural network model that contains a single NMS layer. I'd be curious to know if that works. If not, CoreMLHelpers on GitHub has a Swift version of NMS that you can adapt to work with the model. – Matthijs Hollemans May 10 '21 at 20:49
  • @James: That is a great find. We separated our models into single-class models and the app works like a charm without any problem. Thank you very much! – Tuan Do May 13 '21 at 03:40
  • @TuanDo @Matthijs Hollemans @James I didn't understand this. Can you explain how it works and correct me if I'm wrong? I have an animation file that I try to record; for the recording I have `flipbook`, and while recording it uses a 3-class model. If the animation is longer than 20 seconds the memory is not released and the app crashes; if it is 20 seconds or less, the memory is released after recording. I need to release memory while recording. Is that possible? –  May 13 '21 at 09:23
  • To confirm: is this solved by using only one class? Or how did you solve it? –  May 13 '21 at 09:31
  • 1
    @RB's: My model has 1 class now and the problem has been solved; maybe you can try that in your case – Tuan Do May 13 '21 at 09:34
  • @TuanDo So using a single class will solve this issue? –  May 13 '21 at 09:37
  • 1
    @MatthijsHollemans It seems to me that any implementation of NMS that I create has unsupported CoreML operations, whether I try to use the built-in `tf.image.combined_non_max_suppression` or create my own TF version. I will test the Swift implementation from CoreMLHelpers (thanks!); my only concern is speed, but I can benchmark. – James May 13 '21 at 11:47
  • 1
    @RB's From what I can tell you have a 3-class object detection model. The NMS stage for the 3 classes is what's causing your problem. If you change your architecture to 1-class object detection, and then maybe run the bounding boxes through a 3-class classifier, you won't have this issue. That might be a drastic change to your setup, and I think (hope) Apple will fix this bug soon, so your alternative is to wait for a fix. – James May 13 '21 at 11:51
  • @James In my case the 3-class model is called continuously while recording; that creates a recursive tree, which is why it can't release the frame buffer. Right? –  May 13 '21 at 11:58
  • The iOS 14.6 RC on the developer portal does not fix this issue. WTF Apple!? – Jon Vogel May 17 '21 at 19:36
  • is that delegate declared as weak? – Juan Boero Jul 14 '21 at 19:26

3 Answers

4

The iOS 14.7 beta available on the developer portal seems to have fixed this issue.

Jon Vogel
  • I used YOLOv5 as an .mlmodel in my app and frequently got this error message: `Error Domain=com.apple.vis Code=11 "encountered unknown exception" UserInfo={NSLocalizedDescription=encountered unknown exception}`. When I updated to iOS 14.7.1 I was not able to reproduce these errors, so it seems to be fixed. – leon-w Aug 12 '21 at 11:40
3

I have a partial fix for this using @Matthijs Hollemans' CoreMLHelpers library.

The model I use has 300 classes and 2363 anchors. I used a lot of the code Matthijs provided here to convert the model to MLModel.

In the last step a pipeline is built from the 3 sub-models: raw_ssd_output, decoder, and nms. For this workaround you need to remove the nms model from the pipeline and output raw_confidence and raw_coordinates instead.

In your app you need to add the code from CoreMLHelpers.
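
(For readers who don't have the library handy: the `BoundingBox` type and the `nonMaxSuppressionMultiClass` helper used below come from CoreMLHelpers. Judging only from how they are used here, `BoundingBox` is roughly a value type of the following shape; see the library for the real definition.)

    import CoreGraphics

    // Approximate shape of CoreMLHelpers' BoundingBox, inferred from its
    // usage below; not copied from the library.
    struct BoundingBox {
        let classIndex: Int
        let score: Float
        let rect: CGRect
    }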

Then add this function to decode the output from your MLModel:

    func decodeResults(results:[VNCoreMLFeatureValueObservation]) -> [BoundingBox] {
        let raw_confidence: MLMultiArray = results[0].featureValue.multiArrayValue!
        let raw_coordinates: MLMultiArray = results[1].featureValue.multiArrayValue!
        print(raw_confidence.shape, raw_coordinates.shape)
        var boxes = [BoundingBox]()
        let startDecoding = Date()
        for anchor in 0..<raw_confidence.shape[0].int32Value {
            var maxInd:Int = 0
            var maxConf:Float = 0
            for score in 0..<raw_confidence.shape[1].int32Value {
                let key = [anchor, score] as [NSNumber]
                let prob = raw_confidence[key].floatValue
                if prob > maxConf {
                    maxInd = Int(score)
                    maxConf = prob
                }
            }
            let y0 = raw_coordinates[[anchor, 0] as [NSNumber]].doubleValue
            let x0 = raw_coordinates[[anchor, 1] as [NSNumber]].doubleValue
            let y1 = raw_coordinates[[anchor, 2] as [NSNumber]].doubleValue
            let x1 = raw_coordinates[[anchor, 3] as [NSNumber]].doubleValue
            let width = x1-x0
            let height = y1-y0
            let x = x0 + width/2
            let y = y0 + height/2
            let rect = CGRect(x: x, y: y, width: width, height: height)
            let box = BoundingBox(classIndex: maxInd, score: maxConf, rect: rect)
            boxes.append(box)
        }
        let finishDecoding = Date()
        let keepIndices = nonMaxSuppressionMultiClass(numClasses: raw_confidence.shape[1].intValue, boundingBoxes: boxes, scoreThreshold: 0.5, iouThreshold: 0.6, maxPerClass: 5, maxTotal: 10)
        let finishNMS = Date()
        var keepBoxes = [BoundingBox]()
        
        for index in keepIndices {
            keepBoxes.append(boxes[index])
        }
        print("Time Decoding", finishDecoding.timeIntervalSince(startDecoding))
        print("Time Performing NMS", finishNMS.timeIntervalSince(finishDecoding))
        return keepBoxes
    }

Then when you receive the results from Vision, you call the function like this:

    if let rawResults = vnRequest.results as? [VNCoreMLFeatureValueObservation] {
        let boxes = self.decodeResults(results: rawResults)
        print(boxes)
    }

This solution is slow because of the way I move the data around and formulate my list of BoundingBox types. It would be much more efficient to process the MLMultiArray data using underlying pointers, and maybe use Accelerate to find the maximum score and best class for each anchor box.
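
As a rough sketch of that faster approach (assuming `raw_confidence` is a contiguous Float32 multi-array of shape [numAnchors, numClasses] in row-major order; the function name is just for illustration), the inner class-search loop can be replaced with a pointer walk plus `vDSP_maxvi`:

    import Accelerate
    import CoreML

    // Finds the best class index and score for every anchor using vDSP,
    // instead of subscripting the MLMultiArray element by element.
    func bestClassPerAnchor(rawConfidence: MLMultiArray) -> [(classIndex: Int, score: Float)] {
        let numAnchors = rawConfidence.shape[0].intValue
        let numClasses = rawConfidence.shape[1].intValue
        let ptr = rawConfidence.dataPointer.bindMemory(to: Float32.self,
                                                       capacity: numAnchors * numClasses)
        var best = [(classIndex: Int, score: Float)]()
        best.reserveCapacity(numAnchors)
        for anchor in 0..<numAnchors {
            var maxValue: Float = 0
            var maxIndex: vDSP_Length = 0
            // One pass over this anchor's class scores: max value and its index.
            vDSP_maxvi(ptr + anchor * numClasses, 1, &maxValue, &maxIndex, vDSP_Length(numClasses))
            best.append((classIndex: Int(maxIndex), score: maxValue))
        }
        return best
    }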

James
  • Hey James! I'm also encountering weird memory spikes when using MLModel, without even calling predict: simply initializing it uses 2-3 GB of RAM without doing anything, while the model weighs less than 1 MB. Could it be connected to the layers/classes the model stores? (Pretty new to this field.) Here is the SO question: https://stackoverflow.com/questions/67968988/mlmodel-crash-app-on-init-due-to-memory-issue – Roi Mulia Jun 14 '21 at 14:46
1

In my case it helped to disable the Neural Engine by forcing Core ML to run on CPU and GPU only. This is often slower but doesn't throw the exception (at least in our case). In the end we implemented a policy that forces some of our models not to run on the Neural Engine on certain iOS devices.

See MLModelConfiguration.computeUnits to constrain the hardware a Core ML model can use.
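
A minimal sketch of that configuration (here `YourModel` stands in for whatever Xcode-generated model class you use; adapt the request setup to your own pipeline):

    import CoreML
    import Vision

    // Force Core ML to skip the Apple Neural Engine and use CPU + GPU only.
    func makeCPUAndGPURequest() throws -> VNCoreMLRequest {
        let config = MLModelConfiguration()
        config.computeUnits = .cpuAndGPU      // alternatives: .all, .cpuOnly

        // `YourModel` is a placeholder for the Xcode-generated model class.
        let coreMLModel = try YourModel(configuration: config)
        let visionModel = try VNCoreMLModel(for: coreMLModel.model)
        return VNCoreMLRequest(model: visionModel) { request, error in
            // Handle the results exactly as before.
        }
    }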

  • Thank you @Michael. I've tried that, but in my case it still throws an exception. We are trying to change the model; hope it will work. – Tuan Do May 11 '21 at 23:14