My dataset consists of instances of series data, each with associated metadata. It is similar to a CD collection, where each track has metadata (artist, album, length, etc.) and a series of audio data. Or imagine a road condition survey dataset: each time a survey is conducted, the metadata (location, date, time, operator, etc.) is recorded, along with physical series data describing the road condition for each unit length of road. The collection of surveys ({metadata, data} pairs) is the dataset.
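To make the shape concrete, here is a minimal sketch of one way to model it in plain Python. The `Survey` class, its field names and the values are all made up for illustration; they are not part of any library.

```python
from dataclasses import dataclass
from datetime import date

import numpy as np


@dataclass
class Survey:
    # Metadata fields (hypothetical names).
    location: str
    survey_date: date
    operator: str
    # Series data: one reading per unit length of road.
    roughness: np.ndarray


# The dataset is simply a collection of {metadata, data} pairs.
dataset = [
    Survey("A1 northbound", date(2020, 1, 15), "kim", np.array([0.1, 0.3, 0.2])),
    Survey("B7 eastbound", date(2020, 2, 3), "lee", np.array([0.5, 0.4])),
]

# Search by metadata, then analyse the matching series.
by_kim = [s for s in dataset if s.operator == "kim"]
mean_roughness = [float(s.roughness.mean()) for s in by_kim]
```

Note that the series can have different lengths from one survey to the next, which is part of what makes a single flat table awkward.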
I'd like to take advantage of pandas to help import, store, search and analyse that dataset. pandas does not have built-in support for this type of dataset, but many have tried to add it.
The typical solutions are one of the following:
Adding metadata to a pandas DataFrame. This is the wrong way around: I want a collection of metadata records, each with associated data, not data with associated metadata.
Casting the data to be a valid field in a DataFrame and storing it as one of the metadata fields. The casting process discards significant integrity.
Using multiple indices to create a 3D DataFrame. This imposes design details on the choice of index, which limits experimentation.
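As a rough sketch of what these three workarounds can look like in practice (the survey records, field names and values are invented for illustration, and `DataFrame.attrs` is only one possible way to attach metadata; the pandas documentation marks it as experimental):

```python
import numpy as np
import pandas as pd

# Hypothetical survey records: metadata plus a 1-D series of readings.
surveys = [
    {"location": "A1", "operator": "kim", "data": np.array([0.1, 0.3, 0.2])},
    {"location": "B7", "operator": "lee", "data": np.array([0.5, 0.4])},
]

# 1. Metadata attached to the data: one DataFrame per survey,
#    with the metadata stored in the (experimental) .attrs dict.
per_survey = []
for s in surveys:
    df = pd.DataFrame({"roughness": s["data"]})
    df.attrs.update({k: v for k, v in s.items() if k != "data"})
    per_survey.append(df)

# 2. Data stored as a field of the metadata table: each cell of the
#    "data" column holds a whole array, so the column dtype is object.
meta_table = pd.DataFrame(surveys)

# 3. Multiple indices: one long table indexed by (survey, position).
long_table = pd.DataFrame(
    [
        (i, pos, val)
        for i, s in enumerate(surveys)
        for pos, val in enumerate(s["data"])
    ],
    columns=["survey", "position", "roughness"],
).set_index(["survey", "position"])
```

In the first form there is no single object to query across surveys, in the second the series is opaque to pandas, and in the third the metadata has to live in a separate table keyed by `survey`.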
This sort of dataset is very common, and I see a lot of people trying to bend pandas to accommodate it. I wonder what the right approach is, or even if pandas is the right tool.