
In Python, I trained an image classification model with Keras that takes a [224, 224, 3] array as input and outputs a prediction (1 or 0). When I save the model and load it into Xcode, it states that the input has to be in MLMultiArray format.

Is there a way for me to convert a UIImage into MLMultiArray format? Or is there a way to change my Keras model so that it accepts CVPixelBuffer objects as input?

Paul Lim

3 Answers


In your Core ML conversion script you can supply the parameter image_input_names='data', where 'data' is the name of your input.

Now Core ML will treat this input as an image (CVPixelBuffer) instead of a multi-array.
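
For a Keras model like the one in the question, a minimal conversion sketch using the coremltools Keras converter might look like the following. The file name 'model.h5', the input name 'image', and the label list are placeholders, not taken from the question; substitute your own values.

```python
import coremltools

# 'model.h5' and 'image' are placeholder names; use your own saved Keras
# model file and the actual name of its input layer.
coreml_model = coremltools.converters.keras.convert(
    'model.h5',
    input_names='image',
    image_input_names='image',  # treat this input as an image (CVPixelBuffer)
    class_labels=['0', '1'],
)
coreml_model.save('Classifier.mlmodel')
```

With image_input_names set, the interface Xcode generates for the .mlmodel takes a CVPixelBuffer rather than an MLMultiArray.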

Alex Brown
Matthijs Hollemans
  • And once you have your model configured to accept images, you don't need to reformat your images and convert them to CVPixelBuffer manually — the new Vision framework [does that for you](https://stackoverflow.com/q/44400741/957768). – rickster Jun 15 '17 at 04:36
  • Yes, going through Vision is definitely the easiest way to do things. – Matthijs Hollemans Jun 15 '17 at 07:51
  • I tried the same, but that didn't fix my issue. Please see this link: https://forums.developer.apple.com/message/242517#242517 – skr Jul 05 '17 at 18:07

When you convert a Caffe model to an MLModel, you need to add this line:

image_input_names = 'data'

Taking my own conversion script as an example, the script should look like this:

import coremltools

coreml_model = coremltools.converters.caffe.convert(
    ('gender_net.caffemodel', 'deploy_gender.prototxt'),
    image_input_names='data',
    class_labels='genderLabel.txt')
coreml_model.save('GenderMLModel.mlmodel')

Your MLModel's input will then be a CVPixelBufferRef instead of an MLMultiArray. Converting a UIImage to a CVPixelBufferRef is then straightforward.

Tamás Sengel
  • Notice that image_input_names='data' only works if your input is really called "data". If you did not rename your input, then it's probably called "input_1" or "input__0". – Tony TRAN May 14 '18 at 08:30

I have not tried this myself, but here is how it's done in the Food101 sample:

// Note: `resize(to:)` and `pixelData()` are UIImage helper extensions
// defined elsewhere in the Food101 sample project.
func preprocess(image: UIImage) -> MLMultiArray? {
    let size = CGSize(width: 299, height: 299)

    // Resize, then scale each RGBA byte from [0, 255] to [-1, 1].
    guard let pixels = image.resize(to: size).pixelData()?.map({ (Double($0) / 255.0 - 0.5) * 2 }) else {
        return nil
    }

    guard let array = try? MLMultiArray(shape: [3, 299, 299], dataType: .double) else {
        return nil
    }

    // De-interleave RGBA into separate channel planes (alpha is dropped).
    let r = pixels.enumerated().filter { $0.offset % 4 == 0 }.map { $0.element }
    let g = pixels.enumerated().filter { $0.offset % 4 == 1 }.map { $0.element }
    let b = pixels.enumerated().filter { $0.offset % 4 == 2 }.map { $0.element }

    // Fill the multi-array in planar [3, 299, 299] order: all R, then G, then B.
    let combination = r + g + b
    for (index, element) in combination.enumerated() {
        array[index] = NSNumber(value: element)
    }

    return array
}

https://github.com/ph1ps/Food101-CoreML
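
For illustration only, the scaling and channel de-interleaving that this Swift function performs can be sketched in plain Python. The tiny byte list below is made-up test data, not taken from the sample.

```python
# Sketch of the preprocessing above: scale interleaved RGBA bytes to
# [-1, 1] and reorder them into planar RGB (all R values, then G, then B),
# matching the MLMultiArray shape [3, height, width].

def preprocess(rgba_bytes):
    # Scale each byte from [0, 255] to [-1.0, 1.0].
    pixels = [(b / 255.0 - 0.5) * 2 for b in rgba_bytes]

    # Interleaved RGBA: offsets 0, 1, 2 (mod 4) are R, G, B; alpha is dropped.
    r = pixels[0::4]
    g = pixels[1::4]
    b = pixels[2::4]

    # Planar layout: all R values, then all G, then all B.
    return r + g + b

# Two pixels: opaque white followed by opaque black.
flat = preprocess([255, 255, 255, 255, 0, 0, 0, 255])
print(flat)  # [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
```

As the author notes in the comments below this answer, the `(x / 255 - 0.5) * 2` normalization is specific to the Food101 model; other models expect different (or no) scaling.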

BangOperator
  • Ah, this won't work for every model. I'm the author of that sample, and the only reason I didn't do it like Matthijs is that I needed some preprocessing. The part where I divide by 255 and subtract 0.5 would be wrong for every model except mine. To make it work for others, remove those calculations from that map. – ph1psG Jul 12 '17 at 07:15
  • What should be done for grayscale images? Also my array shape is [1, 48, 48, 1] and the image size is 48x48 – Asteroid Apr 11 '22 at 21:17