I've been trying to use the latest MobileNet, MobileNet_v3, to run object detection. Google's pre-trained models, including the one I'm trying to use, "ssd_mobilenet_v3_large_coco", are listed here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
I don't know what input format these new models expect, and I can't find any in-depth documentation about it online. The following Java code summarizes how I'm attempting to feed image data to the model (specifically the .tflite model, using TensorFlow Lite), based on the limited information I could gather. The model only returns prediction confidences on the order of 10^-20, so it never actually recognizes anything. I conclude from this that I must be feeding it incorrectly.
//Note that the model takes a 320 x 320 image
//Get image data as integer values
private int[] intValues;
intValues = new int[320 * 320];
private Bitmap croppedBitmap = null;
croppedBitmap = Bitmap.createBitmap(320, 320, Config.ARGB_8888);
croppedBitmap.getPixels(intValues, 0, croppedBitmap.getWidth(), 0, 0, croppedBitmap.getWidth(), croppedBitmap.getHeight());
//create ByteBuffer as input for running ssd_mobilenet_v3
private ByteBuffer imgData;
imgData = ByteBuffer.allocateDirect(320 * 320 * 3);
imgData.order(ByteOrder.nativeOrder());
//fill Bytebuffer
//Note that shifting and masking with 0xFF extracts each 8-bit channel (R, G, B) from the packed ARGB int
imgData.rewind();
for (int i = 0; i < inputSize; ++i) { // inputSize is 320 here
    for (int j = 0; j < inputSize; ++j) {
        int pixelValue = intValues[i * inputSize + j];
        // Quantized model
        imgData.put((byte) ((pixelValue >> 16) & 0xFF));
        imgData.put((byte) ((pixelValue >> 8) & 0xFF));
        imgData.put((byte) (pixelValue & 0xFF));
    }
}
// Set up output buffers
private float[][][] output0;
private float[][][][] output1;
output0 = new float[1][2034][91];
output1 = new float[1][2034][1][4];
//Create the input array and output HashMap, then run the model
Object[] inputArray = {imgData};
Map<Integer, Object> outputMap = new HashMap<>();
outputMap.put(0, output0);
outputMap.put(1, output1);
tfLite.runForMultipleInputsOutputs(inputArray, outputMap);
//Examine confidences to see if any significant detections were made
for (int i = 0; i < 2034; i++) {
    for (int j = 0; j < 91; j++) {
        System.out.println(output0[0][i][j]);
    }
}
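To make the input-format question concrete, here is a minimal sketch of the two layouts a TFLite image model might expect: one byte per channel (quantized uint8, which is what the code above writes) versus four bytes per channel (float32). The class and method names (InputPacking, packUint8, packFloat32) are hypothetical, and the float scaling to [-1, 1] with mean and std of 127.5 is an assumption borrowed from common MobileNet preprocessing; nothing above confirms which layout or scaling this particular model uses. Calling tfLite.getInputTensor(0).dataType() and .shape() on the interpreter should show which layout the model declares.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class InputPacking {
    // Pack ARGB pixels as 1 byte per channel (layout for a quantized uint8 input).
    static ByteBuffer packUint8(int[] pixels) {
        ByteBuffer buf = ByteBuffer.allocateDirect(pixels.length * 3);
        buf.order(ByteOrder.nativeOrder());
        for (int p : pixels) {
            buf.put((byte) ((p >> 16) & 0xFF)); // R
            buf.put((byte) ((p >> 8) & 0xFF));  // G
            buf.put((byte) (p & 0xFF));         // B
        }
        buf.rewind();
        return buf;
    }

    // Pack ARGB pixels as float32, 4 bytes per channel (layout for a float input).
    // ASSUMPTION: values are scaled to [-1, 1] using mean = std = 127.5;
    // the actual model may expect a different normalization.
    static ByteBuffer packFloat32(int[] pixels) {
        final float mean = 127.5f;
        final float std = 127.5f;
        ByteBuffer buf = ByteBuffer.allocateDirect(pixels.length * 3 * 4);
        buf.order(ByteOrder.nativeOrder());
        for (int p : pixels) {
            buf.putFloat((((p >> 16) & 0xFF) - mean) / std); // R
            buf.putFloat((((p >> 8) & 0xFF) - mean) / std);  // G
            buf.putFloat(((p & 0xFF) - mean) / std);         // B
        }
        buf.rewind();
        return buf;
    }

    public static void main(String[] args) {
        int[] pixels = {0xFFFF8000}; // one opaque pixel: R=255, G=128, B=0
        ByteBuffer q = packUint8(pixels);
        ByteBuffer f = packFloat32(pixels);
        // The float layout needs four times as many bytes as the uint8 layout.
        System.out.println(q.capacity() + " bytes vs " + f.capacity() + " bytes");
    }
}
```

For a 320 x 320 input, the uint8 layout occupies 320 * 320 * 3 bytes, as allocated above, while the float32 layout would need 320 * 320 * 3 * 4 bytes.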