I am having a strange issue with TensorFlow that I suspect has a simple solution.
I am working with video data. To create my training data, I used ffmpeg
to extract video frames to PNG files and then used LabelImg to create a bounding box dataset. The trained network works beautifully on these PNG files.
However, if I open the video with OpenCV and feed frames from it directly to the network, detection quality collapses. I use the OpenCV VideoCapture class like this:
import cv2

video = cv2.VideoCapture(path_to_video)
status, frame = video.read()  # frame is a NumPy array holding the decoded frame
if status:
    output_dict = run_inference_for_single_image(frame, detection_graph)
Note that run_inference_for_single_image is the helper function from the TensorFlow Object Detection API example code; it is the same function I use to run detection on a PNG file after opening it and converting it to a NumPy array.
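For reference, the PNG path looks roughly like this (a sketch of my loading code, following the Object Detection API tutorial; image_path is a placeholder):

from PIL import Image
import numpy as np

image = Image.open(image_path)  # open the labeled PNG with PIL
image_np = np.array(image)      # convert to an H x W x 3 uint8 array
output_dict = run_inference_for_single_image(image_np, detection_graph)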
I get a few detections, but accuracy drops to almost zero. If I instead save the same frame to a PNG file and feed that file into the network, it works as expected.
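Concretely, this round trip through disk gives correct detections (a sketch; "frame.png" is just a temporary path):

import cv2
import numpy as np
from PIL import Image

cv2.imwrite("frame.png", frame)                # write the OpenCV frame to disk
image_np = np.array(Image.open("frame.png"))   # reload it the same way as my training PNGs
output_dict = run_inference_for_single_image(image_np, detection_graph)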
What do I need to change to avoid the step of saving video frames as PNG files?