
I have an Android project with OpenCV 4.0.1 and TFLite installed, and I want to run inference with a pretrained MobileNetV2 on a cv::Mat that I extracted and cropped from a CameraBridgeViewBase (Android style). But it's proving difficult.

I followed this example.

That example runs inference on a ByteBuffer variable called "imgData" (line 71, class: org.tensorflow.lite.examples.classification.tflite.Classifier).

That imgData appears to be filled in the method called "convertBitmapToByteBuffer" of the same class (line 185), adding pixel by pixel from a bitmap that seems to have been cropped shortly before.
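For reference, the filling logic in the example looks roughly like the sketch below (not copied verbatim; the names imgData and intValues come from the Classifier class, while the /255 normalization is my assumption, since the example's float models normalize with a mean and std):

// Rough sketch of convertBitmapToByteBuffer for a float model.
// imgData is assumed to be a direct ByteBuffer of 1 * 224 * 224 * 3 * 4 bytes, in nativeOrder.
private void convertBitmapToByteBuffer(Bitmap bitmap) {
    imgData.rewind();
    bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
    int pixel = 0;
    for (int i = 0; i < 224; ++i) {
        for (int j = 0; j < 224; ++j) {
            final int val = intValues[pixel++];
            imgData.putFloat(((val >> 16) & 0xFF) / 255.0f); // R
            imgData.putFloat(((val >> 8) & 0xFF) / 255.0f);  // G
            imgData.putFloat((val & 0xFF) / 255.0f);         // B
        }
    }
}

With that in mind, here is what I'm doing: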

private int[] intValues = new int[224 * 224];
Mat _croppedFace = new Mat(); // Cropped image from the CvCameraViewFrame.rgba() method.

float[][] outputVal = new float[1][1]; // Output of my MobileNetV2 (I changed the output layer during training; tested in Python).

// Following: https://stackoverflow.com/questions/13134682/convert-mat-to-bitmap-opencv-for-android
Bitmap bitmap = Bitmap.createBitmap(_croppedFace.cols(), _croppedFace.rows(), Bitmap.Config.ARGB_8888);
Utils.matToBitmap(_croppedFace, bitmap);

convertBitmapToByteBuffer(bitmap); // Same call as in the example.
// runInference();
_tflite.run(imgData, outputVal);

But it seems that what I feed does not match the input_shape of my NN, even though I'm following the MobileNet example because my NN is a MobileNetV2.
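One way to check what the interpreter actually expects is to query the input tensor at runtime (a small sketch; _tflite is the Interpreter instance from the code above):

// Print the shape and data type of the model's first input tensor, e.g. [1, 224, 224, 3].
int[] inputShape = _tflite.getInputTensor(0).shape();
org.tensorflow.lite.DataType inputType = _tflite.getInputTensor(0).dataType();
Log.d("TFLite", "Input shape: " + Arrays.toString(inputShape) + ", type: " + inputType);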

1 Answer


I've solved the error, but I'm sure that it isn't the best way to do it.

Keras MobileNetV2's input_shape is (nBatches, 224, 224, nChannels). I just want to predict a single image, so nBatches == 1, and I'm working in RGB mode, so nChannels == 3.

// Nasty nasty, but it works. nBatches == 2? -- _croppedFace is 224x224, 3 channels.
float[][][][] _inputValue = new float[2][_croppedFace.rows()][_croppedFace.cols()][3];

// Fill _inputValue; note that Mat.get() takes (row, col).
for (int row = 0; row < _croppedFace.rows(); ++row)
    for (int col = 0; col < _croppedFace.cols(); ++col)
        for (int z = 0; z < 3; ++z)
            _inputValue[0][row][col][z] = (float) _croppedFace.get(row, col)[z] / 255; // DL works better with 0:1 values.

/*
The output has this shape, but I don't really know why.
I'm sure one of those 2's is nClasses (I'm working with 2 classes),
but I don't really know why the other one is there.
*/
float[][] outputVal = new float[2][2];
// TensorFlow Lite interpreter
_tflite.run(_inputValue, outputVal);

Python gives the same shape: Python prediction: [[XXXXXX, YYYYY]] <- this comes from the last layer I made; this is just a prototype NN.
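To read the prediction out of outputVal, something like the sketch below should work (the first index is presumably the batch dimension, mirroring the 2 batches declared above, so only row 0 corresponds to the image that was actually filled):

// Pick the class with the highest score from the first (and only filled) batch entry.
float[] scores = outputVal[0];
int predictedClass = 0;
for (int c = 1; c < scores.length; ++c) {
    if (scores[c] > scores[predictedClass]) {
        predictedClass = c;
    }
}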

Hope this helps someone, and also that someone can improve the answer, because this is not very optimized.
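For what it's worth, a less wasteful variant would skip the nested float array and feed the interpreter a direct ByteBuffer filled straight from the Bitmap, closer to what the official example does. This is only a sketch: it assumes a 224x224 RGB float model whose input batch dimension is 1, and simple /255 normalization.

// Allocate a direct buffer: 1 batch * 224 * 224 pixels * 3 channels * 4 bytes per float.
ByteBuffer inputBuffer = ByteBuffer.allocateDirect(1 * 224 * 224 * 3 * 4);
inputBuffer.order(ByteOrder.nativeOrder());

int[] pixels = new int[224 * 224];
bitmap.getPixels(pixels, 0, 224, 0, 0, 224, 224);
for (int val : pixels) {
    inputBuffer.putFloat(((val >> 16) & 0xFF) / 255.0f); // R
    inputBuffer.putFloat(((val >> 8) & 0xFF) / 255.0f);  // G
    inputBuffer.putFloat((val & 0xFF) / 255.0f);         // B
}

float[][] output = new float[1][2]; // 1 batch, 2 classes.
_tflite.run(inputBuffer, output);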
