
I am having a strange issue with TensorFlow that I suspect has a simple solution.

I am working with video data. To create my training data, I used ffmpeg to extract video frames to PNG files and then used LabelImg to create a bounding box dataset. The trained network works beautifully on these PNG files.

However, if I open a video with OpenCV and feed frames from that video to the network, it doesn't work at all. I use the OpenCV VideoCapture class like this:

import cv2

video = cv2.VideoCapture(path_to_video)
status, frame = video.read()
output_dict = run_inference_for_single_image(frame, detection_graph)

Note that run_inference_for_single_image is the helper function provided with the TensorFlow Object Detection API; it is the same function I use to run detection on a PNG file after opening it and converting it to a numpy array.

I get a few detections, but accuracy is reduced almost to zero. If instead I save the same frame as a PNG file and feed that file into the network, it works as expected.

What do I need to change to avoid the step of saving video frames as PNG files?

Ryan

1 Answer


OpenCV has the somewhat surprising behavior of handling color images, including frames returned by VideoCapture.read(), in BGR channel order, while the PNG files you trained and tested on were presumably loaded as RGB. The red and blue channels of your video frames are therefore swapped by the time they reach the network. You can use

im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)

to convert them into RGB before feeding them to your net.
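
For reference, here is a minimal sketch of how that conversion fits into a frame-reading loop. It assumes path_to_video, run_inference_for_single_image, and detection_graph are the same names used in your question:

import cv2

video = cv2.VideoCapture(path_to_video)
while True:
    status, frame = video.read()
    if not status:
        # no more frames (or a read error)
        break
    # OpenCV returns BGR; convert to RGB before running inference
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    output_dict = run_inference_for_single_image(rgb_frame, detection_graph)
    # ... use output_dict (boxes, scores, classes) here ...
video.release()

This also explains why saving a frame to PNG first worked: cv2.imwrite interprets the array as BGR, so the file on disk has the correct colors and is then read back in RGB order.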

P-Gn