3

I'm working on an application for which I need to be able to store EXIF metadata in a relational database. In the future I'd like to also support XMP and IPTC metadata, but at the moment my focus is on EXIF.

There are a few questions on Stack Overflow about what the table structure should look like when storing EXIF metadata. However, none of them really address my concern. The problem I have is that different EXIF tags have values in different formats, and there isn't really one column type which conveniently stores them all.

The most common type is a "rational" which is an array of two four-byte integers representing a fraction. But there are also non-fractional short and long integers, ASCII strings, byte arrays, and "undefined" (an 8-bit type which must be interpreted according to a priori knowledge of the specific tag.) I'd like to support all of these types, and I want to do so in a convenient, efficient, lossless (i.e. without converting the rationals to floats), extensible and searchable manner.

Here's what I've considered so far:

  • My current solution is to store everything as a string. This makes it pretty easy to store all of the different types, and is also convenient for searching and debugging. However, it's kind of clunky and inefficient because when I want to actually use the data, I have to do a bunch of string manipulation to convert the rational values into their fractional equivalents, e.g. fraction = float(value.split('/')[0]) / float(value.split('/')[1]). (It's not actually a messy one-liner like that in my real code, but this demonstrates the problem.)

  • I could grab the raw EXIF bytes for each value from the file and store them in a blob column, but then I'd have to reinterpret the raw bytes every time. This could be marginally more CPU-efficient than the string solution, but it's much, much worse in every other way - on the whole, not worth it.

  • I could have a different table for each different EXIF datatype. Using this pattern I can maintain my foreign key relationships while storing my values in several different tables. However, this will make my most common query, which is to select all EXIF metadata for a given photo, kind of nasty. It will also become unwieldy very quickly when I add support for other metadata formats.

I'm not a database expert by any means, so there some pattern or magic union-style column type I'm missing that can make this problem go away? Or am I stuck picking my poison from among the three options above?

Community
  • 1
  • 1
Mitch Lindgren
  • 2,120
  • 1
  • 18
  • 36

1 Answers1

1

This is probably a very cheap solution, but I would personally just store the json or something like that within the database.

There is a cool way to extract EXIF data and parse it to json.

Here is the link: Img2JSON

I hope this kind of helps you!

Wurzelgogerer
  • 303
  • 2
  • 6
  • Thanks for the suggestion, but this is not a searchable solution, so it doesn't have many advantages over just dumping the entire EXIF section of the JPEG into the database as a blob. – Mitch Lindgren Oct 19 '12 at 04:33