I am having a strange issue with TensorFlow that I suspect has a simple solution.
I am working with video data. To create my training data, I used ffmpeg
to extract video frames to PNG files and then used LabelImg to create a bounding box dataset. The trained network works beautifully on these PNG files.
However, if I open the video with OpenCV and feed frames from it directly to the network, detection quality collapses. I use the OpenCV VideoCapture class like this:
import cv2

video = cv2.VideoCapture(path_to_video)
status, frame = video.read()  # frame is a NumPy array holding the decoded frame
if status:
    output_dict = run_inference_for_single_image(frame, detection_graph)
Note that run_inference_for_single_image is the helper function from the TensorFlow Object Detection API example code; it is the same function I use to run detection on a PNG file after opening it and converting it to a NumPy array.
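For reference, the PNG path looks roughly like this (a sketch of my loading code, following the Object Detection API tutorial; image_path is a placeholder):

from PIL import Image
import numpy as np

image = Image.open(image_path)  # open the labeled PNG with PIL
image_np = np.array(image)      # convert to an H x W x 3 uint8 array
output_dict = run_inference_for_single_image(image_np, detection_graph)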
I get a few detections, but accuracy drops to almost zero. If I instead save the same frame to a PNG file and feed that file into the network, it works as expected.
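Concretely, this round trip through disk gives correct detections (a sketch; "frame.png" is just a temporary path):

import cv2
import numpy as np
from PIL import Image

cv2.imwrite("frame.png", frame)                # write the OpenCV frame to disk
image_np = np.array(Image.open("frame.png"))   # reload it the same way as my training PNGs
output_dict = run_inference_for_single_image(image_np, detection_graph)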
What do I need to change to avoid the step of saving video frames as PNG files?