2

I am trying to store "dynamic" properties about objects in SQL. As an example, let's say I have a table called objs that has two columns (id, name). Now some users may want to store a property called hocus while others may want to store a property called pocus (or even banana). Anything really.

My thought is to create two tables, props and obj_props. props would have two columns (id and prop_name), and obj_props would have (obj_id, prop_id, and value).

My only concern is this seems like a lot of overhead if there are millions of objects, each with 20-30 properties. I know I could create an index in obj_props on obj_id and prop_id but will this still be able to perform well? Is there a better solution for something like this? I'm looking into MongoDB but the lack of joins is frustrating.

Leah Sapan
  • 3,621
  • 7
  • 33
  • 57
  • 1
    I'd recommend to search nosql db that fits well to your application reqiremnts. The method you've described is a variant of common EAV antipattern. It can be used for small amount of dynamic data, but it performs really bad when you have millions of objects. – Konstantin V. Salikhov Jan 28 '14 at 18:46
  • 1
    Have you looked at hstore? – mu is too short Jan 28 '14 at 19:26
  • Indeed we've been using hstore for some years now, it works very well. We're using some JSON also, since pg9.3. – Drasill Jan 28 '14 at 19:32
  • I hadn't heard of hstore before but after doing some research, that looks like the ultimate way to go, thanks! – Leah Sapan Jan 28 '14 at 21:39

2 Answers2

4

First, you should start with a proper database schema (using standard data model patterns) so you can avoid this as much as possible.

Martin Fowler recommends using either a serialized LOB (such as JSON or XML), or allowing the user to edit their own database schema (which is my preferred method):

http://martinfowler.com/bliki/UserDefinedField.html

Bill Karwin has a link in the comments on creating a second table to index values in the blob field

Neil McGuigan
  • 46,580
  • 12
  • 123
  • 152
  • 2
    See my presentation [Extensible Data Modeling with MySQL](http://www.slideshare.net/billkarwin/extensible-data-modeling). – Bill Karwin Mar 13 '14 at 20:53
4

This has been discussed repeatedly before:

The short version: EAV has its place, but it's often better to use json, XML, or hstore. PostgreSQL 9.4's enhanced json will probably become the most attractive choice, as it combines the advantages of json and hstore.

Community
  • 1
  • 1
Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • Yes I completely agree with you, JSON in 9.4 looks like the silver bullet. For now though, hstore will work for most of my use cases as I don't need nesting of the data. Thanks for your input! – Leah Sapan Jan 29 '14 at 12:40