How to read mp4 video to be processed by scikit-image?

Question

I would like to apply a scikit-image function (specifically the template matching function match_template) to the frames of an mp4 video with H.264 encoding. It's important for my application to track the time of each frame, but I know the framerate, so I can easily calculate it from the frame number.

Please note that I'm running on low resources, and I would like to keep dependencies as slim as possible: numpy is needed anyway, and since I'm planning to use scikit-image, I would rather avoid importing (and compiling) openCV just to read the video.

I see at the bottom of this page that scikit-image can seamlessly process video stored as a numpy array, so obtaining the video in that form would be ideal.

Well, I tried openCV while developing a prototype of my application on PC. But since I'm going to deliver the app on raspberry pi, I'm evaluating lighter alternatives, also considering the effort and dependencies to compile opencv on raspi. — gaggio, Apr 19 '15 at 00:41
See also [this overview](https://github.com/danielballan/scikit-image/blob/video-guide/doc/source/user_guide/video.txt) that we are preparing for the user guide. — Stefan van der Walt, Apr 20 '15 at 18:15
@StefanvanderWalt: The overview is actually really helpful, thanks. It might be improved by adding `imageio`, which also solves the problem of accessing a specific frame number that is mentioned in your overview. — gaggio, Apr 21 '15 at 10:38
@gaggio Would you kindly make that comment on the pull request, then I'm sure the author would gladly incorporate it. — Stefan van der Walt, Apr 22 '15 at 21:18

head7 · Accepted Answer · 2015-04-21T21:39:49.143

The imageio Python package should do what you want. Here is a Python snippet using this package:

import pylab
import imageio

filename = '/tmp/file.mp4'
# Open the video through imageio's ffmpeg plugin
vid = imageio.get_reader(filename, 'ffmpeg')
nums = [10, 287]  # frame numbers to display
for num in nums:
    image = vid.get_data(num)  # the requested frame as a numpy array
    fig = pylab.figure()
    fig.suptitle('image #{}'.format(num), fontsize=20)
    pylab.imshow(image)
pylab.show()

You can also iterate directly over the frames in the file (see the documentation):

for i, im in enumerate(vid):
    print('Mean of frame %i is %1.1f' % (i, im.mean()))
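
Since the question is about match_template, the same iteration can feed each frame to it. The following is only a sketch of how this might be wired up, not part of imageio: `locate_template` is a hypothetical helper, frames are assumed to be 2-D grayscale or RGB arrays, and the template is assumed to be a 2-D grayscale array smaller than the frame. It runs on synthetic frames so that no video file is needed.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import match_template


def locate_template(frames, template):
    """Yield (frame_number, (row, col), score) for the best match in each frame."""
    for num, frame in enumerate(frames):
        # match_template needs image and template with the same number of dimensions
        gray = rgb2gray(frame) if frame.ndim == 3 else frame
        result = match_template(gray, template)
        # The peak of the correlation map marks the top-left corner of the best match
        row, col = np.unravel_index(np.argmax(result), result.shape)
        yield num, (int(row), int(col)), float(result[row, col])


# Synthetic stand-in for video frames: random noise, with the template cut from it
rng = np.random.default_rng(0)
frame = rng.random((40, 60))
template = frame[10:18, 20:30].copy()
matches = list(locate_template([frame, frame], template))
print(matches)
```

With a real file, the reader itself can be passed as `frames`, for example `locate_template(vid, template)`, because iterating over it yields one frame at a time.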

To install imageio you can use pip:

pip install imageio

Another solution would be to use moviepy (which uses similar code to read video), but I think imageio is lighter and does the job.

Response to the first comment

To check whether the nominal frame rate holds over the whole file, you can count the frames in the iterator and compare the count with the number of frames in the metadata:

count = 0
try:
    for _ in vid:
        count += 1
except RuntimeError:
    print('something went wrong while iterating, maybe a wrong fps number')
finally:
    print('number of frames counted {}, number of frames in metadata {}'.format(count, vid.get_meta_data()['nframes']))


In [10]: something went wrong while iterating, maybe a wrong fps number
         number of frames counted 454, number of frames in metadata 461

To display frames together with their timestamps (here roughly one frame per second of video):

fps = vid.get_meta_data()['fps']
try:
    for num, image in enumerate(vid.iter_data()):
        # Only show frames whose number is a multiple of the (truncated) fps
        if num % int(fps) != 0:
            continue
        fig = pylab.figure()
        pylab.imshow(image)
        timestamp = num / float(fps)  # seconds, assuming a constant frame rate
        print(timestamp)
        fig.suptitle('image #{}, timestamp={}'.format(num, timestamp), fontsize=20)
        pylab.show()
except RuntimeError:
    print('something went wrong')
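
The timestamp arithmetic used above (frame number divided by fps) can be kept separate from the plotting. The following is a minimal sketch, not part of imageio: `with_timestamps` is a hypothetical helper name, and it assumes a constant frame rate, which is exactly what the frame-count check above is meant to verify.

```python
def with_timestamps(frames, fps, start=0.0):
    """Yield (frame_number, time_in_seconds, frame) for an iterable of frames.

    Assumes a constant frame rate; `start` is the time of frame 0.
    """
    if fps <= 0:
        raise ValueError('fps must be positive')
    for num, frame in enumerate(frames):
        yield num, start + num / float(fps), frame


# Stand-in frames for illustration; with imageio, pass the reader itself
# and use fps = vid.get_meta_data()['fps'] instead.
stamped = list(with_timestamps(['a', 'b', 'c'], fps=25))
print(stamped)
```

Writing it as a generator means only one frame needs to be held in memory at a time, which suits the low-resource setting described in the question.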

Thanks for the useful link and example. What is left out from this is just the time corresponding to each frame. I'll give it a go, the time might be in the frame metadata, otherwise it has to be calculated from frame number, which is ok under the assumption that the nominal framerate is correctly maintained throughout video recording. — gaggio, Apr 21 '15 at 10:32
Good question; on some videos the nominal framerate isn't correctly maintained. To check it, you can count the frames in the iterator and compare that with the number of frames in the metadata; if they are equal, you can compute the timestamp of each frame as frame_number / fps_rate. I have updated my answer to compare the two numbers. — head7, Apr 21 '15 at 21:21
This works well for me. Thanks very much. IMO much friendlier than ffmpeg / avconv / opencv / scikit-video's opencv feel. Wish I had discovered this earlier. — rd11, May 26 '16 at 15:17

score 24 · Answer 2 · answered Aug 17 '15 at 02:04

You could use scikit-video, like this:

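from skvideo.io import VideoCapture

filename = '/tmp/file.mp4'  # path to the video (example value)
cap = VideoCapture(filename)
cap.open()

while True:
    retval, image = cap.read()
    if not retval:
        # no frame was returned: stop before using image
        break
    # image is a numpy array containing the next frame
    # do something with image here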

This uses avconv or ffmpeg under the hood. The performance is quite good, with a small overhead to move the data into python compared to just decoding the video in avconv.

The advantage of scikit-video is that the API is exactly the same as the video reading/writing API of OpenCV; just replace cv2.VideoCapture with skvideo.io.VideoCapture.
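
Because both libraries follow the same `read()` convention, returning `(retval, image)`, the loop above can be wrapped in a small generator that works with either capture object. This is only a sketch: `iter_frames` and `FakeCapture` are hypothetical names, and `FakeCapture` exists solely so that the example runs without a video file.

```python
def iter_frames(cap):
    """Yield frames from a capture object until read() reports failure.

    `cap` is any object whose read() returns (retval, image).
    """
    while True:
        retval, image = cap.read()
        if not retval:
            break
        yield image


class FakeCapture:
    """Stand-in capture object used only to demonstrate iter_frames."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture(['frame0', 'frame1'])))
print(frames)
```

The processing code can then be written as a plain `for image in iter_frames(cap):` loop, which keeps the end-of-stream check in one place.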

Win GATE ECE · Answer 3 · 2018-05-10T06:18:02.990

An easy way to read video in Python is to use skvideo. A single line of code reads the entire video.

import skvideo.io
videodata = skvideo.io.vread("video_file_name")
print(videodata.shape)  # (frames, height, width, channels)

http://mllearners.blogspot.in/2018/01/scikit-video-skvideo-tutorial-for.html
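
Because vread returns the whole video as a single array, its memory footprint is worth checking on a low-resource device like the one in the question. Below is a rough sketch of the arithmetic, assuming 8-bit RGB frames; the helper name and the example resolution and duration are illustrative and not taken from the answer.

```python
def video_array_bytes(n_frames, height, width, channels=3, bytes_per_value=1):
    """Size in bytes of a (frames, height, width, channels) array."""
    return n_frames * height * width * channels * bytes_per_value


# Hypothetical clip: 10 seconds at 25 fps, 1280x720, RGB, one byte per value
size = video_array_bytes(n_frames=250, height=720, width=1280)
print('{:.1f} MiB'.format(size / 2.0 ** 20))
```

If the estimate is too large for the target device, decoding frame by frame, as in the other answers, keeps only one frame in memory at a time.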

edited May 10 '18 at 06:18

answered Jan 19 '18 at 11:00
