3

I want to store user-defined segments. A segment will consist of several different rules. I was thinking I would either create a separate separate table of "Rules" with three columns: attribute name, operator, and value. For example, if a Segment is users in the united states the rule would be "country = US" in their respective columns. A segment can have many rules.

The other option is to store these as JSONB via Postgres in a "Rules" column in the Segment table. I'd follow a similar pattern to the above with an array of rules or something. What are the pros and cons of each method?

Maybe neither one of these is the right approach.

Boris Schegolev
  • 3,601
  • 5
  • 21
  • 34
mergesort
  • 5,087
  • 13
  • 38
  • 63
  • Here is an example where the rules are stored as jsonb array of json. see the accepted answer for the full structure of the table as well as the queries to run: http://stackoverflow.com/questions/38679190/query-jsonb-column-containing-array-of-json-objects – brg Aug 29 '16 at 21:27

2 Answers2

6

The choice is basically about the way you wish to read the data.

You are better off with JSON if:

  • you are not going to filter (with a WHERE clause) through the Rules
  • you do not need to get statistics (i.e. GROUP BY)
  • you will not imply any constraints on attributes/operators/values
  • you simply select the values (SELECT ..., Rules)

If you meet these requirements you can store data as JSON, thus eliminating JOINs and subselects, eliminating the overhead of primary key and indexes on Rules, etc.

But if you don't meet these you should store the data in a common relational design - your approach 1.

Boris Schegolev
  • 3,601
  • 5
  • 21
  • 34
  • I'm also not sure if the schema of rules will remain the same. For example, right now I would make the schema for rules "Name, Value, Operator", but what if the operator is between and I need two values? I think JSONB gives me that flexibility, right? Is that what you meant by "you will not imply any constraints on attributes/operators/values"? – mergesort Aug 29 '16 at 17:43
  • 1
    That's exactly what I meant. You are free to store any semi-structural data in JSON. – Boris Schegolev Aug 29 '16 at 17:54
  • Thanks! If I'm using this functionality of Postgres, would it maybe make sense to use a schema-less database or something along those lines? Perhaps DynamoDB? – mergesort Aug 29 '16 at 20:07
  • I would not go that far :) Relational databases have great advantages when you need to deal with the 4 bullets I mentioned in the answer, and in many other cases I did not mention. I personally stick with PostgreSQL as a full-featured object-relational database, and use JSONB when I don't need the "strict" schema. – Boris Schegolev Aug 29 '16 at 20:16
2

I would go with the first approach of storing the pieces of data individually in a relational database. It sounds like your data (segments->rules) will always contain the same structure (which is fairly simple), so there isn't a pressing reason to store the data as JSON.

As a side note, I think you will need another column in the "Rules" table, serving as a foreign key to the "Segments" table.

Pros to approach 1:

  • Data is easy to search and select. Your SQL statements can directly access specific information about the rules (the specific operators, name, value, etc) without having to parse the JSON object for the desired rule.
  • The above will result in reduced processing time
  • Only need to parse the JSON once (before the insert)

Cons to approach 1:

  • Requires parsing of JSON before the insert
  • Requires multiple inserts per segment

Regarding your last sentence, it is hard to prescribe a database design without knowing more about your intended functionality. For example, if the attribute names have meaning beyond a single segment, you would want to store the attribute names separately and reference them in the Rules table.

Michael
  • 425
  • 2
  • 9
  • The idea is to query other data based on segments. So these rules will compose segments, and then I will serve data based on which segment a user is in. Does that make sense? – mergesort Aug 29 '16 at 17:44