Getting video properties with Python without calling external software

Question

[Update:] Yes, it is possible, now some 20 months later. See Update3 below! [/update]

Is that really impossible? All I could find were variants of calling FFmpeg (or other software). My current solution is shown below, but what I really would like to get for portability is a Python-only solution that doesn't require users to install additional software.

After all, I can easily play videos using PyQt's Phonon, yet I can't get simple things like the dimensions or duration of the video?

My solution uses ffmpy (http://ffmpy.readthedocs.io/en/latest/ffmpy.html ) which is a wrapper for FFmpeg and FFprobe (http://trac.ffmpeg.org/wiki/FFprobeTips). Smoother than other offerings, yet it still requires an additional FFmpeg installation.

    import ffmpy, subprocess, json
    ffprobe = ffmpy.FFprobe(global_options="-loglevel quiet -sexagesimal -of json -show_entries stream=width,height,duration -show_entries format=duration -select_streams v:0", inputs={"myvideo.mp4": None})
    print("ffprobe.cmd:", ffprobe.cmd)  # printout the resulting ffprobe shell command
    stdout, stderr = ffprobe.run(stderr=subprocess.PIPE, stdout=subprocess.PIPE)
    # std* is byte sequence, but json in Python 3.5.2 requires str
    ff0string = str(stdout,'utf-8')

    ffinfo = json.loads(ff0string)
    print(json.dumps(ffinfo, indent=4)) # pretty print

    print("Video Dimensions: {}x{}".format(ffinfo["streams"][0]["width"], ffinfo["streams"][0]["height"]))
    print("Streams Duration:", ffinfo["streams"][0]["duration"])
    print("Format Duration: ", ffinfo["format"]["duration"])

Results in output:

    ffprobe.cmd: ffprobe -loglevel quiet -sexagesimal -of json -show_entries stream=width,height,duration -show_entries format=duration -select_streams v:0 -i myvideo.mp4
    {
        "streams": [
            {
                "duration": "0:00:32.033333",
                "width": 1920,
                "height": 1080
            }
        ],
        "programs": [],
        "format": {
            "duration": "0:00:32.064000"
        }
    }
    Video Dimensions: 1920x1080
    Streams Duration: 0:00:32.033333
    Format Duration:  0:00:32.064000
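
If the duration is needed as a number rather than as the string that `-sexagesimal` produces, a small helper can convert it. This is only a sketch: the name `sexagesimal_to_seconds` is made up here, and it assumes the `H:MM:SS.ffffff` layout seen in the output above. Dropping `-sexagesimal` from the options should make ffprobe report plain seconds instead, which avoids the conversion altogether.

```python
def sexagesimal_to_seconds(text):
    """Convert an 'H:MM:SS.ffffff' string (ffprobe -sexagesimal layout) to float seconds."""
    hours, minutes, seconds = text.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# the stream duration string from the output above
print(sexagesimal_to_seconds("0:00:32.033333"))
```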

UPDATE after several days of experimentation: The hachoir solution proposed by Nick below does work, but it will give you a lot of headaches, because the hachoir responses are too unpredictable. Not my choice.

With OpenCV the coding couldn't be any easier:

    import cv2
    vid = cv2.VideoCapture(picfilename)
    height = vid.get(cv2.CAP_PROP_FRAME_HEIGHT)  # always 0 in Linux python3
    width = vid.get(cv2.CAP_PROP_FRAME_WIDTH)    # always 0 in Linux python3
    print("opencv: height:{} width:{}".format(height, width))

The problem is that it works well on Python 2 but not on Python 3. Quote: "IMPORTANT NOTE: MacOS and Linux packages do not support video related functionality (not compiled with FFmpeg)" (https://pypi.python.org/pypi/opencv-python).

On top of this, it seems that OpenCV needs the binary packages of FFmpeg to be present at runtime (https://docs.opencv.org/3.3.1/d0/da7/videoio_overview.html).

Well, if I need an installation of FFmpeg anyway, I can stick to my original ffmpy example shown above :-/

Thanks for the help.

UPDATE2: master_q (see below) proposed MediaInfo. While this failed to work on my Linux system (see my comments), the alternative of using pymediainfo, a py wrapper to MediaInfo, did work. It is simple to use, but it takes 4 times longer than my initial ffprobe approach to obtain duration, width and height, and still needs external software, i.e. MediaInfo:

    from pymediainfo import MediaInfo
    media_info = MediaInfo.parse("myvideofile")
    for track in media_info.tracks:
        if track.track_type == 'Video':
            print("duration (millisec):", track.duration)
            print("width, height:", track.width, track.height)

UPDATE3: OpenCV is finally available for Python 3, and is claimed to run on Linux, Windows, and Mac! It makes things really easy, and I verified that external software, in particular ffmpeg, is NOT needed!

First install OpenCV via pip:

    pip install opencv-python

Run in Python:

    import cv2
    cv2video = cv2.VideoCapture(videofilename)
    height = cv2video.get(cv2.CAP_PROP_FRAME_HEIGHT)
    width = cv2video.get(cv2.CAP_PROP_FRAME_WIDTH)
    print("Video Dimension: height:{} width:{}".format(height, width))

    framecount = cv2video.get(cv2.CAP_PROP_FRAME_COUNT)
    frames_per_sec = cv2video.get(cv2.CAP_PROP_FPS)
    print("Video duration (sec):", framecount / frames_per_sec)

    # equally easy to get this info from images
    cv2image = cv2.imread(imagefilename, flags=cv2.IMREAD_COLOR)
    height, width, channel = cv2image.shape
    print("Image Dimension: height:{} width:{}".format(height, width))
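
A slightly more defensive variant of the video part might look like the sketch below. The earlier update reported properties coming back as 0 when OpenCV could not read the file, in which case dividing the frame count by the frame rate would raise `ZeroDivisionError`. The function names `duration_seconds` and `probe_video` are made up for this example, and the guarded import is only there so that the pure helper can be used without OpenCV installed.

```python
try:
    import cv2  # provided by "pip install opencv-python"
except ImportError:  # allows the pure helper below to run without OpenCV
    cv2 = None


def duration_seconds(frame_count, fps):
    """Return frame_count / fps, or None when either value is missing or not positive."""
    if not frame_count or not fps or frame_count <= 0 or fps <= 0:
        return None
    return frame_count / fps


def probe_video(path):
    """Return (width, height, duration in seconds or None) for a video file."""
    if cv2 is None:
        raise RuntimeError("OpenCV (cv2) is not installed")
    video = cv2.VideoCapture(path)
    try:
        if not video.isOpened():
            raise IOError("OpenCV could not open {}".format(path))
        # get() returns floats, so convert the dimensions to int
        width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
        duration = duration_seconds(video.get(cv2.CAP_PROP_FRAME_COUNT),
                                    video.get(cv2.CAP_PROP_FPS))
    finally:
        video.release()  # free the file handle even if something failed
    return width, height, duration
```

Calling `probe_video("myvideo.mp4")` would then give a tuple of width, height and duration, with `None` as the duration when OpenCV reports no usable frame count or frame rate.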

I also needed the first frame of a video as an image, and had used ffmpeg to save that image to the file system. This is also easier with OpenCV:

    hasFrames, cv2image = cv2video.read()        # reads 1st frame
    if hasFrames:                                # False if no frame could be read
        cv2.imwrite("myfilename.png", cv2image)  # extension defines image type

But even better, as I need the image only in memory for use in the PyQt5 toolkit, I can read the cv2 image directly into a Qt image:

    height, width, channel = cv2image.shape  # integers; cv2video.get() returns floats
    bytesPerLine = 3 * width
    # my_qt_image = QImage(cv2image, width, height, bytesPerLine, QImage.Format_RGB888) # may give false colors!
    # OpenCV stores pixels in BGR order, so swap to RGB:
    my_qt_image = QImage(cv2image.data, width, height, bytesPerLine, QImage.Format_RGB888).rgbSwapped() # correct colors on my systems

As OpenCV is a huge program, I was concerned about timing. It turned out that OpenCV was never behind the alternatives. It takes some 100 ms to read a slide; everything else combined never takes more than 10 ms.

I tested this successfully on Ubuntu Mate 16.04, 18.04, and 19.04, and on two different installations of Windows 10 Pro. (I did not have a Mac available.) I am really delighted with OpenCV!

You can see it in action in my SlideSorter program, which lets you sort images and videos, preserve the sort order, and present them as a slideshow. Available here: https://sourceforge.net/projects/slidesorter/

Are you saying you don't want to use [tag:pyqt4]? You could use [Phonon.MediaObject.metaData](http://pyqt.sourceforge.net/Docs/PyQt4/phonon-mediaobject.html#metaData) — Peter Wood, Nov 23 '17 at 11:50
If you're on windows, see [this question](https://stackoverflow.com/questions/31507038/python-how-to-read-windows-media-created-date-not-file-creation-date). — Peter Wood, Nov 23 '17 at 12:02
I am on Linux, but would like to see something cross-platform. — ullix, Nov 23 '17 at 15:12
I already use PyQt4, so it would be natural to take advantage of it. But as far as I can see, the metadata contain a lot of, well, metadata, but nothing as mundane as width, height, and duration (https://xiph.org/vorbis/doc/v-comment.html ). The 'Phonon.VideoPlayer()' is easy to use and works well, but I can't find any of the info I am looking for. — ullix, Nov 23 '17 at 15:16
Phonon [delegates to GStreamer on Linux](http://pyqt.sourceforge.net/Docs/PyQt4/phonon-module.html#video). See [Getting started with GStreamer with Python](http://www.jonobacon.com/2006/08/28/getting-started-with-gstreamer-with-python/) — Peter Wood, Nov 23 '17 at 15:44
What about hachoir? I'm looking to do the same thing, and feel like hachoir may be the best option - seems to be a generic metadata parser: https://stackoverflow.com/a/26350426/587938 has an example. I *think* all the variables you're looking for should be in the file's metadata stream in most cases. — Nick, Nov 24 '17 at 00:06
I thought PyQt4 was complicated, but then I hadn't seen GStreamer yet :-( And in none of the many examples I went through did I see someone extracting video dimensions. — ullix, Nov 24 '17 at 15:38
Your answer (UPDATE3) shouldn't be part of the question, it should be an actual answer. You can even accept it if you want. — Mark Ransom, Nov 09 '22 at 16:38

Nick · Answer 1 · 2017-11-24T21:25:58.097

OK, after investigating this myself because I needed it too, it looks like it can be done with hachoir. Here's a code snippet that can give you all the metadata hachoir can read:

    import os
    import re
    from hachoir.parser import createParser
    from hachoir.metadata import extractMetadata

    def get_video_metadata(path):
        """
            Given a path, returns a dictionary of the video's metadata, as parsed by hachoir.
            Keys vary by exact filetype, but for an MP4 file on my machine,
            I get the following keys (inside of "Common" subdict):
                "Duration", "Image width", "Image height", "Creation date",
                "Last modification", "MIME type", "Endianness"

            Dict is nested - common keys are inside of a subdict "Common",
            which will always exist, but some keys *may* be inside of
            video/audio specific stream subdicts, named "Video Stream #1"
            or "Audio Stream #1", etc. Not all formats result in this
            separation.

            :param path: str path to video file
            :return: dict of video metadata
        """

        if not os.path.exists(path):
            raise ValueError("Provided path to video ({}) does not exist".format(path))

        parser = createParser(path)
        if not parser:
            raise RuntimeError("Unable to get metadata from video file")

        with parser:
            metadata = extractMetadata(parser)

            if not metadata:
                raise RuntimeError("Unable to get metadata from video file")

        metadata_dict = {}
        line_matcher = re.compile(r"-\s(?P<key>.+):\s(?P<value>.+)")  # raw string, so \s is not treated as a string escape
        group_key = None  # group_key stores which group we're currently in for nesting subkeys
        for line in metadata.exportPlaintext():  # this is what hachoir offers for dumping readable information
            parts = line_matcher.match(line)
            if not parts:  # not all lines have metadata - at least one is a header
                if line == "Metadata:":  # if it's the generic header, set it to "Common" to match items with multiple streams, so there's always a Common key
                    group_key = "Common"
                else:
                    group_key = line[:-1]  # strip off the trailing colon of the group header and set it to be the current group we add other keys into
                metadata_dict[group_key] = {}  # initialize the group
                continue

            if group_key:  # if we're inside of a group, then nest this key inside it
                metadata_dict[group_key][parts.group("key")] = parts.group("value")
            else:  # otherwise, put it in the root of the dict
                metadata_dict[parts.group("key")] = parts.group("value")

        return metadata_dict

This seems to return good results for me right now and requires no extra installs. The keys seem to vary a decent amount by video and type of video, so you'll need to do some checking and not just assume any particular key is there. This code is written for Python 3 and is using hachoir3 and adapted from hachoir3 documentation - I haven't investigated if it works for hachoir for Python 2.

In case it's useful, I also have the following for turning the text-based duration values into seconds:

    def length(duration_value):

        # get the individual time components; every component is optional, including "ms"
        time_split = re.match(r"(?P<hours>\d+\shrs)?\s*(?P<minutes>\d+\smin)?\s*(?P<seconds>\d+\ssec)?\s*(?P<ms>\d+\sms)?", duration_value)

        fields_and_multipliers = {  # multipliers to convert each value to seconds
            "hours": 3600,
            "minutes": 60,
            "seconds": 1,
            "ms": 0.001
        }

        total_time = 0
        for group in fields_and_multipliers:  # iterate through each portion of time, multiply until it's in seconds and add to total
            if time_split.group(group) is not None:  # not all groups will be defined for all videos (eg: "hrs" may be missing)
                total_time += float(time_split.group(group).split(" ")[0]) * fields_and_multipliers[group]  # get the number from the match and multiply it to make seconds

        return total_time
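
The other values in the dictionary are strings as well, for example "1080 pixels" for an image height. A helper along the following lines could pull the number out of such a string. This is a sketch only: the name `first_number` is made up, and it assumes the value starts with or contains a plain decimal number. It should only be applied to keys that are known to be numeric, because it would also find the "4" in a MIME type such as "video/mp4".

```python
import re

# optional sign, digits, optional decimal part
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def first_number(value):
    """Return the first number found in a string as int or float, or None if there is none."""
    match = _NUMBER.search(value)
    if match is None:
        return None
    text = match.group(0)
    return float(text) if "." in text else int(text)


print(first_number("1080 pixels"))
print(first_number("User volume: 100.0%"))
```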

Well, that looks like we are getting there. But look at that RegEx jungle and all the caveats! A little bit of json might do wonders. — ullix, Nov 24 '17 at 15:43
I found opencv could give the answer as a 3-liner, were it not for the problem that _"MacOS and Linux packages do not support video related functionality (not compiled with FFmpeg)."_ — ullix, Nov 24 '17 at 15:48
Could you clarify your comment about using JSON? The text returned from hachoir lacks structure and the dict this function returns is effectively a cleaner representation similar to data loaded from JSON. I initially implemented this with splits instead of regexes, but as I encountered more edge cases, it made more sense to use regexes — Nick, Nov 24 '17 at 21:22
My JSON comments were directed at the hachoir folks, not at your regexes. I am wondering what the value of „- Image height: 1080 pixels“ or of „- Comment: User volume: 100.0%“ is in programs. Wouldn't I rather need the numbers themselves, like 1080 or 100% (as numbers, not as strings)? It takes additional effort to extract the numbers. — ullix, Nov 25 '17 at 09:45
Your code could be made more 'jsonic' to me by replacing `return metadata_dict` with `return json.dumps(metadata_dict, indent=4)` - works just as well for pretty printing as for processing. It still leaves the need for extracting the numbers, which I did with if ... elif ...else statements within your for loop. Not nice but a workaround to the hachoir limits. — ullix, Nov 25 '17 at 09:53
After some more fiddling I am underwhelmed with the hachoir stuff. Depending on the video the keys like width, height and duration may be under completely different headers, which makes a dictionary – be it json compliant or not – not very useful. In addition, some keys come in duplicates, even within the same header, resulting in overwriting of the first one. Eventually I went to string search functions and string splitting to extract width, height and duration. Not sure that is reliable? (Note: your def length fails when no „ms“ is present; a missing „?“ at the end of the match string?) — ullix, Nov 25 '17 at 14:45
Thanks for the heads up on the length failure. I hadn't encountered any that were missing ms yet - I'll have to correct that. I definitely agree that the approach is limited - it works for my needs right now, but sounds like it doesn't for yours. Definitely not for more complex use cases because it'll require repeated looping through the main keys to check for values. — Nick, Nov 25 '17 at 23:25

master_q · Answer 2 · 2018-01-16T10:05:22.143

MediaInfo is another choice. It is cross-platform when used together with MediaInfoDLL.py and the MediaInfo.DLL library. Download MediaInfo.dll from their site (the CLI package contains the DLL), or get both files, including the Python script, from https://github.com/MediaArea/MediaInfoLib/releases

Working in Python 3.6: you create a dict of the parameters you want. The keys have to be exact, but the values are filled in later; the ones shown here are only there to make clear what a value might look like.

    from MediaInfoDLL import *

    # could be in __init__ of some class
    self.video = {'Format': 'AVC', 'Width': '1920', 'Height': '1080', 'ScanType': 'Progressive', 'ScanOrder': 'None', 'FrameRate': '29.970',
                  'FrameRate_Num': '', 'FrameRate_Den': '', 'FrameRate_Mode': '', 'FrameRate_Minimum': '', 'FrameRate_Maximum': '',
                  'DisplayAspectRatio/String': '16:9', 'ColorSpace': 'YUV', 'ChromaSubsampling': '4:2:0', 'BitDepth': '8',
                  'Duration': '', 'Duration/String3': ''}
    self.audio = {'Format': 'AAC', 'BitRate': '320000', 'BitRate_Mode': 'CBR', 'Channel(s)': '2', 'SamplingRate': '48000', 'BitDepth': '16'}

    # a method within a class:
    def mediainfo(self, file):
        MI = MediaInfo()
        MI.Open(file)
        for key in self.video:
            # 0 means track 0
            value = MI.Get(Stream.Video, 0, key)
            self.video[key] = value
        for key in self.audio:
            # 0 means track 0
            value = MI.Get(Stream.Audio, 0, key)
            self.audio[key] = value
        MI.Close()

    # calling it from another method:
    self.mediainfo(self.file)

    # you'll get a dict with correct values, if none then value is ''
    # for example to get frame rate out of that dictionary:
    fps = self.video['FrameRate']
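
Since the values come back as strings, with '' for anything missing, they may need converting before they can be used in calculations. A minimal sketch of such a conversion follows; the helper name `convert_value` and the sample dict are made up for illustration and are not part of MediaInfoDLL.

```python
def convert_value(value):
    """Turn a string into int or float where possible; '' becomes None, other text is kept."""
    value = value.strip()
    if not value:
        return None
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass  # not parseable by this type, try the next one
    return value  # non-numeric text such as 'AVC' or '16:9' stays a string


# sample values in the style of the dict above
video = {'Format': 'AVC', 'Width': '1920', 'Height': '1080',
         'FrameRate': '29.970', 'FrameRate_Num': ''}
typed = {key: convert_value(value) for key, value in video.items()}
print(typed)
```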

Thanks. On Ubuntu Mate 16.04 you have to install python-mediainfodll (Py2) or python3-mediainfodll (Py3), and then import MediaInfoDLL (Py2) or MediaInfoDLL3 (Py3). Will give it a try. — ullix, Jan 17 '18 at 10:22
Quite odd: while the import can be done and MI.Open is OK, in Py2 each "value" is empty, and in Py3 the MI.Get command always produces an error. Tried with an mp4 and a mov file. — ullix, Jan 17 '18 at 11:07
Did not play with it on Linux yet, will soon though. On Windows I have imported MediaInfoDLL.py (MediaInfoDLL3.py is exactly the same, not needed) and I have Mediainfo.DLL included in working directory. — master_q, Jan 17 '18 at 22:51
On (Ubuntu) Linux you must import MediaInfoDLL on Py2 and MediaInfoDLL3 on Py3, or you will get an ImportError. But it does not work, although MediaInfo is installed. — ullix, Jan 19 '18 at 09:50
But using pymediainfo ("from pymediainfo import MediaInfo"), which I understand is a py wrapper to MediaInfo, does work. It shows that MediaInfo is accessible. However, for my needs of getting duration, width and height it does take >4 times longer than the ffprobe approach shown in the initial post. And still needs the external installation of MediaInfo — ullix, Jan 19 '18 at 09:57
