I have a number of TIFF files which contain descriptions and "keywords" (as OS X terms them in the file inspector). I'm having difficulty collecting this metadata from the images, however.
I've tried tifffile.py, PIL's EXIF functions, and IPTCInfo. tifffile.py retrieves the description, but I can't parse the "keywords" from the file with any of these libraries.
Are keywords stored using a different "specification" for TIFFs than for JPEGs? What would be the best approach to parse these keywords?
EDIT
Further to the comment from abarnert, I opened one of the TIFF files in a text editor and found XML data that contains the "keywords". Snippet below:
...
<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="x-default">OLYMPUS DIGITAL CAMERA</rdf:li>
</rdf:Alt>
</dc:description>
<dc:format>image/tiff</dc:format>
<dc:subject>
<rdf:Bag>
<rdf:li>Foo</rdf:li>
<rdf:li>Bar</rdf:li>
<rdf:li>A long keyword</rdf:li>
</rdf:Bag>
</dc:subject>
</rdf:Description>
...
It looks as though this XML could be stored in a binary representation; tifffile.py lists a number of tags whose values are essentially tuples of integers. I'm not sure how to convert such a tuple back into text, however. Suggestions?
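A minimal sketch of one possible approach, using only the standard library. It rests on several assumptions not confirmed by the files themselves: that the XML is an XMP packet wrapped in an `<x:xmpmeta>` element (the snippet only shows the inner `rdf:Description`), that the packet is UTF-8 encoded, and that a tag value reported as a tuple of integers is simply the packet's raw bytes (TIFF stores XMP in tag 700, `XMLPacket`). The function names are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the XPath below. The dc URI comes from the
# snippet; the rdf URI is the standard RDF syntax namespace.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def ints_to_bytes(values):
    """Convert a tuple of integers (each 0-255) into a bytes object."""
    return bytes(values)


def extract_xmp_packet(raw):
    """Return the <x:xmpmeta>...</x:xmpmeta> span from raw bytes, or None.

    Assumes the packet is stored uncompressed and wrapped in x:xmpmeta.
    """
    start = raw.find(b"<x:xmpmeta")
    if start == -1:
        return None
    end_tag = b"</x:xmpmeta>"
    end = raw.find(end_tag, start)
    if end == -1:
        return None
    return raw[start:end + len(end_tag)]


def keywords_from_xmp(xmp):
    """Return the text of each rdf:li under dc:subject/rdf:Bag."""
    root = ET.fromstring(xmp)
    items = root.iterfind(".//dc:subject/rdf:Bag/rdf:li", NS)
    return [li.text for li in items if li.text]


# Hypothetical usage, reading the whole file and scanning for the packet:
#     with open("example.tif", "rb") as fh:
#         packet = extract_xmp_packet(fh.read())
#     if packet is not None:
#         print(keywords_from_xmp(packet))
#
# If a library hands back the tag value as a tuple of integers instead:
#     packet = ints_to_bytes(tag_value)
```

Scanning the raw bytes avoids depending on any particular TIFF library, at the cost of reading the whole file into memory. If a library already exposes tag 700, passing its value through `ints_to_bytes` and then `keywords_from_xmp` should give the same result under the assumptions above.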