Why do some object detection neural networks return all zeros in OpenCV 4.1.0?

Question

I have a problem evaluating several neural networks in OpenCV 4.1.0 from Java/Scala. The networks return all zeros for the fish-bike image as well as for other images. I observe this with:

COCO SSD512* https://github.com/weiliu89/caffe/tree/ssd
faster_rcnn_inception_v2_coco, ssd_mobilenet_v2_coco https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

Detection works with YOLOv3-spp and YOLOv3-tiny https://pjreddie.com/darknet/yolo/.

What could be wrong with this dnn setup?

// To reproduce in the Scala REPL you need a hack:

def loadLibraryScalaREPL(libraryName: String): Unit = {
  val loadLibrary0 = Runtime.getRuntime.getClass.getDeclaredMethods()(4)
  loadLibrary0.setAccessible(true)
  loadLibrary0.invoke(Runtime.getRuntime, scala.tools.nsc.interpreter.IMain.getClass, libraryName)
}

loadLibraryScalaREPL(org.opencv.core.Core.NATIVE_LIBRARY_NAME)

// To load in a Java/Scala application, just use System.loadLibrary.

import org.opencv.core.{Scalar, Size}
import org.opencv.dnn.Dnn
import org.opencv.imgcodecs.Imgcodecs
    
val image = Imgcodecs.imread("/root/fish-bike.jpg")
val net = Dnn.readNetFromCaffe("/tmp/VGG_coco_SSD_512x512_iter_360000.prototxt", "/tmp/VGG_coco_SSD_512x512_iter_360000.caffemodel")
val blob = Dnn.blobFromImage(image, 1/125.0, new Size(512, 512), new Scalar(104,117,123), true)
net.setInput(blob)
val layer = net.forward()
val values = new Array[Float](layer.total().toInt)
layer.get(0,0, values)
values.grouped(7).foreach(x => println(x.toList))

why are you using `toInt` in `val values = new Array[Float](layer.total().toInt)`? Wouldn't that first change your values to int (and effectively round everything to zero) then back to float (keeping them at 0.0)? — Anton Codes, Aug 07 '20 at 22:37
Hello @AntonCodes. I see the idea. No, it's the size of the array. — Vladimir Protsenko, Aug 09 '20 at 05:56
Can you update your question to show the values from fish-bike.jpg and, if possible, the values from one other image (using the same NN) that seems to return something you expect? — Anton Codes, Aug 11 '20 at 14:09
A naive comment: Does it have anything to do with opencv reading images in BGR while others RGB? — colt.exe, Aug 11 '20 at 16:23

score 1 · Answer 1 · answered Dec 30 '22 at 19:07

Some models expect normalized channel intensities. An image is normally stored as uint8 pixels (values from 0 to 255); such a model needs float32 values in the range -1 to 1 instead. Without that conversion, the model would interpret your image as a blank picture (mostly all white pixels).

Here's a Python function that could be used to normalize the image:

import cv2
import numpy as np

def processFrame(image, input_width, input_height):
    img = cv2.resize(image, (input_width, input_height))  # input size of the detector
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
    # Normalize pixel values to [-1, 1] if using a floating model
    img_rgb = (np.float32(img_rgb) - 127.5) / 127.5
    return img_rgb
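
To connect this back to the question's code: Dnn.blobFromImage subtracts the mean first and then multiplies by the scale factor, so the same [-1, 1] mapping could probably be requested there with a mean of 127.5 per channel and a scale factor of 1/127.5. The sketch below uses only plain Python to check that the two forms agree. The helper names are made up for illustration, and whether a particular model expects this range is an assumption to verify against that model's documentation.

```python
# Pure-Python sketch: two equivalent ways to map 8-bit intensities to [-1, 1].
# Helper names are hypothetical; they are not part of OpenCV.

def normalize_direct(value):
    """The formula from the answer: (x - 127.5) / 127.5."""
    return (value - 127.5) / 127.5

def normalize_mean_scale(value, mean=127.5, scale=1 / 127.5):
    """Mean subtraction followed by scaling, the order blobFromImage applies."""
    return (value - mean) * scale

samples = [0, 64, 128, 255]
direct = [normalize_direct(v) for v in samples]
via_blob_params = [normalize_mean_scale(v) for v in samples]

for v, a, b in zip(samples, direct, via_blob_params):
    print(f"{v:3d} -> {a:+.4f} (direct)  {b:+.4f} (mean/scale)")
```

If the model does expect this range, the corresponding arguments in the Scala call would be 1/127.5 for the scale factor and new Scalar(127.5, 127.5, 127.5) for the mean, with swapRB left at true for an RGB-trained model.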
