More specifically, how do I create a custom reader that reads frames from a video and feeds them into the TensorFlow model graph?
Second, how can I use OpenCV to decode the frames in order to create such a custom reader, if that is possible?
Is there any code that demonstrates this (in Python)?
I am mainly working on emotion recognition from facial expressions, and my database consists of videos.
Finally, I have tried using a Queue and a QueueRunner with a Coordinator, hoping to solve the problem at hand. According to the documentation at https://www.tensorflow.org/programmers_guide/threading_and_queues, the QueueRunner runs the enqueue operation, which in turn takes an operation that creates one example. (Can we use OpenCV inside that operation to create one example, so that the frames are returned as the examples to enqueue?)
Please note that my purpose is to have the enqueue and dequeue operations run at the same time on different threads.
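To make the threading behaviour I am after concrete, here is a minimal standard-library sketch (no TensorFlow): one thread enqueues decoded frames while another dequeues them concurrently, with a sentinel object playing the role that coord.request_stop() plays in my code below. The frame values are just integers standing in for decoded images.

```python
import queue
import threading

frame_queue = queue.Queue(maxsize=128)
SENTINEL = object()  # marks the end of the video stream

def producer(num_frames):
    # Stand-in for the OpenCV capture loop: each integer represents one frame.
    for i in range(num_frames):
        frame_queue.put(i)   # blocks if the queue is full
    frame_queue.put(SENTINEL)

def consumer(results):
    while True:
        item = frame_queue.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        results.append(item)

results = []
threads = [threading.Thread(target=producer, args=(10,)),
           threading.Thread(target=consumer, args=(results,))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))  # 10
```

This is essentially what I understand tf.train.QueueRunner and tf.train.Coordinator to be doing for a tf.FIFOQueue, which is why I structured my attempt the way I did.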
Following is my code so far:
import cv2
import numpy as np
import tensorflow as tf

def deform_images(images):
    with tf.name_scope('current_image'):
        frames_resized = tf.image.resize_images(images, [90, 160])
        frame_gray = tf.image.rgb_to_grayscale(frames_resized, name='rgb_to_gray')
        frame_normalized = tf.divide(frame_gray, tf.constant(255.0), name='image_normalization')
        tf.summary.image('image_summary', frame_gray, 1)
        return frame_normalized
def queue_input(video_path, coord):
    global frame_index
    with tf.device("/cpu:0"):
        # source: http://stackoverflow.com/questions/33650974/opencv-python-read-specific-frame-using-videocapture
        cap = cv2.VideoCapture(video_path)
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_index)
        # Read the next frame from the file. Note that the frame is returned as a
        # Mat (a NumPy array in Python), so it needs to be converted into a tensor.
        (grabbed, frame) = cap.read()
        # If the `grabbed` boolean is False, we have reached the end of the video file.
        if not grabbed:
            coord.request_stop()
            return
        img = np.asarray(frame)
        frame_index += 1
        to_return = deform_images(img)
        print(to_return.get_shape())
        return to_return
frame_index = 0  # global frame counter used by queue_input
with tf.Session() as sess:
    merged = tf.summary.merge_all()
    train_writer = tf.summary.FileWriter('C:\\Users\\temp_user\\Documents\\tensorboard_logs', sess.graph)
    sess.run(tf.global_variables_initializer())  # the initializer op must be run, not just created
    coord = tf.train.Coordinator()
    queue = tf.FIFOQueue(capacity=128, dtypes=tf.float32, shapes=[90, 160, 1])
    enqueue_op = queue.enqueue(queue_input("RECOLA-Video-recordings\\P16.mp4", coord))
    # Create a queue runner that will run 1 thread to enqueue examples.
    # In general, the QueueRunner class is used to create a number of threads
    # cooperating to enqueue tensors into the same queue.
    qr = tf.train.QueueRunner(queue, [enqueue_op] * 1)
    # Launch the queue runner threads under the coordinator. The Coordinator
    # class helps multiple threads stop together and reports exceptions to
    # threads that wait for them to stop.
    enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
    # Build the dequeue op once, outside the loop; calling queue.dequeue()
    # inside the loop would add a new op to the graph on every iteration,
    # and the op must be passed to sess.run to actually execute.
    dequeue_op = queue.dequeue(name='dequeue')
    # Run the training loop, controlling termination with the coordinator.
    for step in range(8000):
        print(step)
        if coord.should_stop():
            break
        frames_tensor = sess.run(dequeue_op)
    coord.join(enqueue_threads)
    train_writer.close()
cv2.destroyAllWindows()
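For clarity, this is a NumPy-only sketch of what deform_images is meant to compute for a single frame (grayscale conversion and [0, 1] normalization; the resize step is omitted). The luma weights are the BT.601 coefficients that tf.image.rgb_to_grayscale documents, written here in BGR order because cv2.VideoCapture returns frames as BGR:

```python
import numpy as np

def deform_frame_np(frame_bgr):
    # BT.601 luma weights in BGR channel order (OpenCV frames are BGR);
    # tf.image.rgb_to_grayscale applies the same weights but expects RGB input.
    gray = frame_bgr @ np.array([0.114, 0.587, 0.299])
    normalized = gray / 255.0            # map [0, 255] -> [0, 1]
    return normalized[..., np.newaxis]   # keep a trailing channel axis

frame = np.full((90, 160, 3), 255, dtype=np.uint8)  # dummy all-white frame
out = deform_frame_np(frame)
print(out.shape)  # (90, 160, 1)
```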
Thank you!!