I am working on a Python script to sort photos based on their date. I would like to include videos as well, but videos have no metadata standard comparable to EXIF for photos.

While testing, I noticed I could find my videos' date using bash:

$ head -c 1600 DSC_7643.AVI | strings
AVI LIST&
hdrlavih8
LISTt
...
NIKON
nctgr
NIKON CORPORATION
NIKON D90
A1.00
 B1.00
2012:10:30 09:38:16
2012:10:30 09:38:16

If I had this list, I could just iterate it looking for parseable dates and make a pretty good guess.
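A minimal sketch of that iteration, assuming the timestamp always uses the `YYYY:MM:DD HH:MM:SS` layout seen in the output above. The names `guess_date` and `DATE_FORMAT` are hypothetical, and other cameras may well write dates differently:

```python
from datetime import datetime

# Assumed format, taken from the sample output: "2012:10:30 09:38:16"
DATE_FORMAT = "%Y:%m:%d %H:%M:%S"

def guess_date(candidates, fmt=DATE_FORMAT):
    """Return the first candidate that parses as a date, or None if none do."""
    for text in candidates:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            # Not a date in the assumed format; try the next string
            continue
    return None

sample = ["NIKON D90", "A1.00", " B1.00", "2012:10:30 09:38:16", "2012:10:30 09:38:16"]
print(guess_date(sample))
```

Returning the first match is a guess in itself; a file could contain several timestamps (creation, modification), so picking the earliest might be a better choice in practice.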

The man page for strings says: "find the printable strings in a object, or other binary, file". Unfortunately, that description is hard to turn into a search for a Python equivalent, and I don't know exactly what strings does to achieve its result. Is there a Python utility or library that can do something similar?

Nicole
  • possible duplicate of [Python equivalent of unix "strings" utility](http://stackoverflow.com/questions/17195924/python-equivalent-of-unix-strings-utility) – Zero Piraeus Aug 13 '13 at 05:36
  • When I read the manpage for strings on any of the Mac, linux, or FreeBSD boxes I have around, they all describe what `strings` actually does better than that. For example, in OS X 10.8: "A string is any sequence of 4 (the default) or more printing characters ending with a newline or a null… except in the (__TEXT,__text) section." – abarnert Aug 13 '13 at 05:56
  • Also, of course, the source code for GNU, *BSD, etc. binutils are all available online. For example, here's [FreeBSD](http://code.google.com/p/freebsd-head/source/browse/contrib/binutils/binutils/strings.c#545). That looks like a lot of code, but almost all of it is about how to do the various different address-printing formats. The actual code to find the next string is about 25 lines. – abarnert Aug 13 '13 at 06:00
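
The behaviour quoted in the comment above (any run of 4 or more printing characters) can be approximated in pure Python with a regular expression over the raw bytes. This is only a sketch, not a port of the binutils code: the helper name `strings` and its `min_len` parameter are hypothetical, it handles ASCII only, and counting tab as printable is an assumption.

```python
import re

def strings(data, min_len=4):
    """Yield runs of at least min_len printable ASCII characters found in bytes."""
    # Printable ASCII runs from 0x20 (space) to 0x7e (~); tab is allowed too (assumption)
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")

# Stand-in for the first bytes of a video file
blob = b"\x00\x01NIKON D90\x00ab\x002012:10:30 09:38:16\xff"
found = list(strings(blob))
print(found)
```

For a real file, the bytes could come from something like `open(path, "rb").read(1600)`, mirroring the `head -c 1600` call in the question.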

1 Answer

Rather than search for a Python alternative to strings, simply invoke the actual strings.

>>> import subprocess
>>> lst = subprocess.check_output(('strings', 'DSC_7643.AVI')).decode('ascii', errors='replace').splitlines()

It appears that AVI files do have a standard metadata encoding. The GNU program extract claims to do what you want:

>>> lst = subprocess.check_output(('extract', 'DSC_7643.AVI')).decode('utf-8', errors='replace').splitlines()

Additionally, there seems to be an API for extracting metadata. It even has a Python binding.

Robᵩ