
How can I convert a video to multiple numpy arrays (or a single one) to use for machine learning? I have only found ways to do this for images.

Saso

1 Answer

A regular image is represented as a 3D tensor with the shape (height, width, channels). The channels value is 3 if the image is RGB and 1 if it is grayscale.

A video is a collection of N frames, where each frame is an image. You'd want to represent this data as a 4D tensor: (frames, height, width, channels).

So for example if you have 1 minute of video with 30 fps, where each frame is RGB and has a resolution of 256x256, then your tensor would look like this: (1800, 256, 256, 3), where 1800 is the number of frames in the video: 30 (fps) * 60 (seconds).
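You can verify this shape arithmetic directly in NumPy; the zeros array below is just a stand-in for real pixel data:

```python
import numpy as np

# 1 minute of 30 fps RGB video at 256x256 (hypothetical numbers from above).
fps, seconds = 30, 60
n_frames = fps * seconds  # 30 * 60 = 1800 frames

video = np.zeros((n_frames, 256, 256, 3), dtype=np.uint8)
print(video.shape)  # (1800, 256, 256, 3)
```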

To achieve this you can open each individual frame of the video, store them all in a list, and stack them together along a new axis (i.e. the "frames" dimension).


You can do this through OpenCV:

import cv2
import numpy as np

# Open the video and read it frame by frame.
vid = cv2.VideoCapture('path/to/video/file')

frames = []
check = True
i = 0

while check:
    check, arr = vid.read()
    if check and i % 20 == 0:  # This condition is if you want to subsample
                               # your video (i.e. keep one frame every 20);
                               # checking `check` avoids appending the None
                               # returned by the final failed read.
        frames.append(arr)
    i += 1

vid.release()
frames = np.array(frames)  # convert list of frames to a numpy array
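If you want to sanity-check the stacking step without a real video file, you can run it on synthetic frames (random arrays standing in for the images returned by `vid.read()`):

```python
import numpy as np

# Five fake 48x64 RGB frames standing in for decoded video frames.
frame_list = [np.random.randint(0, 256, (48, 64, 3), dtype=np.uint8)
              for _ in range(5)]

frames = np.array(frame_list)  # stacks along a new leading "frames" axis
print(frames.shape)  # (5, 48, 64, 3)
```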
Djib2011