Ok, your question is actually directly related to programming:)
Ad I. The format is HEIF, but if you are developing an iPhone app you access the image data through iOS APIs, so you can easily get the bitmap as a CVPixelBuffer.
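For instance, if you start from a CGImage (e.g. from UIImage.cgImage), one way to get a CVPixelBuffer is to draw the image into a freshly created buffer. This is only a minimal sketch: the function name makePixelBuffer and its width/height parameters are mine, and the 32BGRA format is an assumption, since your model may expect something else.

```swift
import CoreGraphics
import CoreVideo

// Hypothetical helper: draws `image` into a new 32BGRA pixel buffer,
// scaling it to width x height in the process.
func makePixelBuffer(from image: CGImage, width: Int, height: Int) -> CVPixelBuffer? {
    let attributes: [CFString: Any] = [
        kCVPixelBufferCGImageCompatibilityKey: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true
    ]
    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                                     kCVPixelFormatType_32BGRA,
                                     attributes as CFDictionary, &pixelBuffer)
    guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }

    // The base address is only valid while the buffer is locked.
    CVPixelBufferLockBaseAddress(buffer, [])
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }

    // Wrap the buffer memory in a bitmap context with a BGRA-compatible layout.
    guard let context = CGContext(
        data: CVPixelBufferGetBaseAddress(buffer),
        width: width,
        height: height,
        bitsPerComponent: 8,
        bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
            | CGBitmapInfo.byteOrder32Little.rawValue
    ) else { return nil }

    context.draw(image, in: CGRect(x: 0, y: 0, width: width, height: height))
    return buffer
}
```

Drawing into the target rectangle also rescales the image, so you can pass the input size your model declares directly.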
Ad II.
1. Neural network gets an image as an input data.
As mentioned above, you want to get your bitmap first, so create a CVPixelBuffer. Check out this post for example. Then you use the CoreML API, specifically the MLFeatureProvider protocol. An object that conforms to it is where you put your input data, wrapped in an MLFeatureValue, under a key name (like "pixelData"). Note that the key has to match the input name your model declares.
import CoreML
// Wraps a single image feature so it can be passed to an MLModel.
class YourImageFeatureProvider: MLFeatureProvider {
    let imageFeatureValue: MLFeatureValue
    // Names of all features this provider can supply.
    var featureNames: Set<String> = []

    init(with imageFeatureValue: MLFeatureValue) {
        featureNames.insert("pixelData")
        self.imageFeatureValue = imageFeatureValue
    }

    // Returns the image for "pixelData" and nil for any other name.
    func featureValue(for featureName: String) -> MLFeatureValue? {
        guard featureName == "pixelData" else {
            return nil
        }
        return imageFeatureValue
    }
}
Then you use it like this. The feature value is created with the init(pixelBuffer:) initializer on MLFeatureValue (initWithPixelBuffer: in Objective-C):
let imageFeatureValue = MLFeatureValue(pixelBuffer: yourPixelBuffer)
let featureProvider = YourImageFeatureProvider(with: imageFeatureValue)
Remember to crop/scale the image before this operation so that your network is fed input of the proper size.
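As a rough illustration of the cropping part, here is a sketch that computes the largest centered square of an image. A square input is just an assumption, and the helper name centerSquareCropRect is made up for this example, so check what shape your model actually expects.

```swift
import Foundation

// Hypothetical helper: returns the largest centered square
// inside an image of the given pixel size.
func centerSquareCropRect(imageWidth: Int, imageHeight: Int) -> CGRect {
    let side = min(imageWidth, imageHeight)
    // Split the leftover pixels evenly on both sides of the longer edge.
    let x = (imageWidth - side) / 2
    let y = (imageHeight - side) / 2
    return CGRect(x: x, y: y, width: side, height: side)
}
```

You can pass the resulting rect to cropping(to:) on your CGImage and then scale the result down to the model's input size.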
2. NN analyzes it, finds required object on the image.
Use the prediction function on your CoreML model.
do {
    let outputFeatureProvider = try yourModel.prediction(from: featureProvider)
    // success! your output feature provider has your data
} catch {
    // your model failed to predict, check the error
}
3. NN returns not only the determined type of object, but the cropped object itself or an array of coordinates/pixels of the area that should be cropped.
This depends on your model and whether you imported it correctly. Assuming you did, you access the output data through the returned MLFeatureProvider. Remember that this is a protocol: you read values from it with featureValue(for:), using your model's output names, and you can wrap that in your own type (something like YourOutputFeatureProvider, similar to what I made for you in step 1). There you have the bitmap and the rest of the data your NN spits out.
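Reading the outputs could look roughly like the sketch below. The output names "croppedImage" and "boundingBox" and the function readOutputs are placeholders I made up; use whatever names and types your model actually declares.

```swift
import CoreML
import CoreVideo
import Foundation

// Hypothetical reader for a model that outputs an image and a coordinate array.
// "croppedImage" and "boundingBox" are assumed names, not part of any real model.
func readOutputs(from output: MLFeatureProvider) -> (image: CVPixelBuffer?, box: [Double]) {
    // Image outputs are exposed as a CVPixelBuffer.
    let image = output.featureValue(for: "croppedImage")?.imageBufferValue

    // Numeric outputs usually arrive as an MLMultiArray; flatten it to [Double].
    var box: [Double] = []
    if let array = output.featureValue(for: "boundingBox")?.multiArrayValue {
        for index in 0..<array.count {
            box.append(array[index].doubleValue)
        }
    }
    return (image, box)
}
```

If you are unsure what your model returns, printing featureNames on the returned provider is a quick way to check.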
4. Application gets all required information from the NN and performs the necessary actions to crop the image and save it to another file or whatever.
Just reverse step 1, so go from MLFeatureValue -> CVPixelBuffer -> UIImage. There are plenty of questions on SO about this so I won't repeat the answers.
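For completeness, one possible way to do that conversion goes through Core Image. This is a sketch under the assumption that the feature value really holds an image; the function name makeUIImage is mine.

```swift
import CoreImage
import CoreML
import UIKit

// Hypothetical helper: MLFeatureValue -> CVPixelBuffer -> UIImage.
func makeUIImage(from featureValue: MLFeatureValue) -> UIImage? {
    // nil if the feature value does not contain an image.
    guard let pixelBuffer = featureValue.imageBufferValue else { return nil }

    // Render the pixel buffer into a CGImage via Core Image.
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    let context = CIContext()
    guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else {
        return nil
    }
    return UIImage(cgImage: cgImage)
}
```

Creating a CIContext is relatively expensive, so if you convert many images you probably want to create it once and reuse it.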
If you are a beginner, don't expect to have results overnight, but the path is here. For an experienced dev I would estimate several hours to get this done (plus model training time and porting it to CoreML).
Apart from CoreML (maybe you will find your model too sophisticated to port to CoreML), check out Matthijs Hollemans' GitHub (very good resources on different ways of porting models to iOS). He is also around here and knows a lot about the subject.