
I have a command line utility (exiftool) which accepts input via stdin.

Calling it from python might look like this:

from subprocess import Popen, PIPE
ps = Popen(['exiftool', '-groupNames', '-json', '-'], stdin=PIPE, stdout=PIPE)

Where the pipe is used via:

# Open in binary mode: image data is not text.
with open(ffile, 'rb') as fh:
    ps.stdin.write(fh.read())
ps.stdin.close()  # signals EOF so exiftool starts processing
print(ps.stdout.read())
ps.wait()

As expected, this outputs the result of running exiftool on the contents of ffile, which are passed via stdin.

I can call this code repeatedly in a loop, but each call forks a new process, and that is actually slow (this is not a case of premature optimization).

So I'm wondering if there is a way to open exiftool once, and then "re-use" Popen, piping multiple files into it, and saving the output for each one.

It doesn't seem like it is possible, because exiftool (unlike cat) seems to interpret its input as an entire chunk, instead of line by line or according to some delimiter. But perhaps it is possible by hacking the exiftool process's stdin?

  • It seems that exiftool is not only a command line tool, it is also a perl library. You could perhaps write a perl script that waits for your program's input and directly calls the library functions. – Gribouillis Jan 01 '17 at 08:39
  • Or, use a Python exif library rather than forking out – Alastair McCormack Jan 01 '17 at 08:40
  • Does exiftool accept multiple files on its stdin? – Alastair McCormack Jan 01 '17 at 08:42
  • @AlastairMcCormack Yes, via David Z's answer, you can see how PyExifTool leverages the `-execute` parameter here: https://github.com/smarnach/pyexiftool/blob/master/exiftool.py#L200. This is still a little opaque but works. –  Jan 01 '17 at 15:44
  • @Gribouillis My application is written in Python so using Perl is a no-go. –  Jan 01 '17 at 15:48
  • Unless you really need a feature of PyExifTool, a pure Python library, such as https://pypi.python.org/pypi/ExifRead is the better option as it'll make your code more portable – Alastair McCormack Jan 01 '17 at 16:13

1 Answer

There is a library PyExifTool which does exactly this: it runs exiftool in batch mode to extract metadata from any number of files using a single forked process. As a bonus, the library calls will parse the metadata for you.
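For reference, here is a minimal sketch of the batch-mode idea using only `subprocess`. It assumes exiftool's `-stay_open True -@ -` mode, in which arguments are read one per line from stdin, each query ends with `-execute`, and exiftool prints `{ready}` once the query is done. `ExifToolSession` and `read_until_ready` are hypothetical names for illustration, not PyExifTool's API. Note that file paths are passed as arguments here, since stdin is used for the argument stream rather than for file contents.

```python
import json
import subprocess

# Assumption: exiftool prints this marker after each -execute in -stay_open mode.
SENTINEL = b"{ready}"


def read_until_ready(stream, sentinel=SENTINEL):
    """Collect lines from a binary stream until a line starting with the sentinel."""
    chunks = []
    for line in iter(stream.readline, b""):
        if line.strip().startswith(sentinel):
            break
        chunks.append(line)
    return b"".join(chunks)


class ExifToolSession(object):
    """Hypothetical helper: one long-lived exiftool process, many queries."""

    def __init__(self, executable="exiftool"):
        # -@ - tells exiftool to read its arguments from stdin, one per line.
        self.proc = subprocess.Popen(
            [executable, "-stay_open", "True", "-@", "-"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def metadata(self, path):
        # Same options as in the question, plus the file path and -execute.
        args = ["-groupNames", "-json", path, "-execute"]
        self.proc.stdin.write(("\n".join(args) + "\n").encode("utf-8"))
        self.proc.stdin.flush()
        raw = read_until_ready(self.proc.stdout)
        # No error handling: an unreadable file yields no JSON and raises here.
        return json.loads(raw.decode("utf-8"))

    def close(self):
        # Ask exiftool to exit, then reap the process.
        self.proc.stdin.write(b"-stay_open\nFalse\n")
        self.proc.stdin.flush()
        self.proc.wait()
```

Usage would look roughly like `session = ExifToolSession()`, then `session.metadata(path)` for each file in the loop, and `session.close()` at the end, so only one process is started. PyExifTool wraps the same mechanism with proper error handling, so it remains the more robust choice.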

Alternatively, you could forgo exiftool entirely and use a pure-Python image manipulation library to read the EXIF data. Recommending a library is out of scope for Stack Overflow, but nevertheless there is a closed question where you can find some options. Keep in mind that the question is seven years old, so you should check for yourself whether the answers are still valid.

David Z
  • PyExifTool worked for me and eliminates the overhead of multiple forks: **"Since exiftool is run in batch mode, only a single instance needs to be launched and can be reused for many queries. This is much more efficient than launching a separate process for every single query."** –  Jan 01 '17 at 15:44