-2

Let's say I have a canvas where there are various objects I can add in, such as a/an:

  • Drawing
  • Image
  • Chart
  • Note
  • Table

For each object I need to store the dimensions and the layer order, for example something like this:

  • ObjectID
  • LayerIndex
  • Dimensions ((x1, y1), (x2, y2))

Each of the objects have vastly different properties and so are stored in different tables (or classes or whatever). Would it be possible to store this into a relational database, and if so, how could it be done? In JSON it would be something like this:

// LayerIndex is the ArrayIndex
// No need to store ObjectID, since the object is stored within the array itself
Layers = [
    {Type: Drawing, Props: <DrawingPropertyObj>, Dimensions: [(1,2), (3,4)]},
    {Type: Chart,   Props: <ChartPropertyObj>,   Dimensions: [(3,4), (10,4)]},
    {Type: Table,   Props: <TablePropertyObj>,   Dimensions: [(10,20), (30,44)]},
    ...
]

The one option I thought of is storing a FK to each table, but in that case, I could potentially join this to N different tables for each object type, so if there are 100 object types, ...

David542
  • 104,438
  • 178
  • 489
  • 842
  • The [entity–attribute–value model](https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model) would work to store different attributes for your grid objects. – Gilbert Le Blanc Nov 22 '21 at 03:20
  • This is fully answered in [this](https://stackoverflow.com/questions/4304217/database-schema-which-can-support-specialized-properties/4359193#4359193) thread and the linked threads therein – Giannis Tsim Dec 01 '21 at 07:16
  • do you need to query the db by the object properties? – kaznovac Dec 02 '21 at 13:17
  • This is a common duplicate. [How can you represent inheritance in a database?](https://stackoverflow.com/q/3579079/3404097) Etc. [How much research effort is expected of Stack Overflow users?](https://meta.stackoverflow.com/q/261592/3404097) – philipxy Dec 05 '21 at 00:24
  • @philipxy understood. I take it you saw that that question has the tag `sql-server` (and also the duplicate you marked from that question), yes? And none of the answers address anything about a potential non-relational way of doing it outside of a one-sentence answer that literally reads, in its entirety, `Alternatively, consider using a document databases (such as MongoDB) which natively support rich data structures and nesting.`. – David542 Dec 05 '21 at 04:41
  • What SQL DBMS doesn't matter (as is clear from the answers) (although some have special polymorphism functionality) and you don't tag one and your question asks about SQL The topics are SQL/DB subtypes/inheritance/polymorphism. The design options are also often offered in answers to questions re design anti-patterns along the lines of many/multiple FKs/references to many/multiple tables. Generally design for enriched SQL types & non-SQL DBMSs begin with relational design then rearrange structure for desired queries. – philipxy Dec 05 '21 at 06:23

2 Answers2

1

You have many options as shown below.

There is not much difference in which one you pick, but I would avoid the multi-table design which is the one you said. An object type with 100 properties would be scattered in 101 tables for no gain. 101 disk page accesses for each object type being read. That's unnecessary (if those pages are cached then this problem would be lesser than otherwise but is still waste).

Even dual table is not really necessary if you don't wish to filter things like 'all objects with color=red', but I guess performance is not so urgent to reach to this point, other things matters more, or other bottlenecks have more influence in performance, so pick the one of the no-more-than-dual-table that fits best for you.

Single table - flexible schema per object type

objlayerindex type props x0 y0 x1 y1
0 drawing {color:#00FF00,background-color:#00FFFF} 1 2 3 4
1 chart {title:2021_sales,values:[[0,0],[3,4]]} 11 22 33 44
  • in props the keys are used for flexibility, different objects of the same type may have different keys, for example a chart without subtitle can omit this key.

Single table - fixed schema per object type

objlayerindex type props x0 y0 x1 y1
0 drawing #00FF00,#00FFFF 1 2 3 4
1 chart 2021_sales,"[[0,0],[3,4]]" 11 22 33 44
  • this schema is fixed - drawing always has color+backgroundcolor; chart always have title+values; etc - less space used but changing schema involves some work on already existing data.

Dual table

Main
objlayerindex type x0 y0 x1 y1
0 drawing 1 2 3 4
1 chart 11 22 33 44
Properties
objlayerindex propertyname propertyvalue
0 color #00FF00
0 background-color #00FFFF
1 title 2021_sales
1 values [[0,0],[3,4]]
  • here we assume that property ordering is not important. If it is, an extra column propertyindex would be needed. For those who love normalization, it is possible also to take propertyname out of this table to a propertykey-propertydescription and reference it by its propertykey.

Multi table

Main
objlayerindex type x0 y0 x1 y1
0 drawing 1 2 3 4
1 chart 11 22 33 44
Color
objlayerindex colorcode
0 #00FF00
Background-Color
objlayerindex colorcode
0 #00FFFF
Title
objlayerindex title
1 2021_sales
Values
objilayerindex chart
1 [[0,0],[3,4]]
  • Specifically this kind of data can be normalized one extra level:
Values
objlayerindex datapoint x y
1 0 0 0
1 1 3 4

You can also use non-relational formats.

Document (Json) Store

[
  {type:drawing,props:{color:#00FF0,background-color:#00FF0},position:[1,2,3,4]},
  {type:chart,props:{title:2021_sales,values:[[0,0],[3,4]]},position:[11,22,33,44]}
]
  • we are citing here because it is a popular and simple format, but different encodings can be used instead of JSON (CSV, protocolbuffers, avro, etc)
brunoff
  • 4,161
  • 9
  • 10
1

A "strict" relational database doesn't suit this task well because you're left with a choice of:

  1. Different tables for each object type with a columns for each attribute that applies to that particular object type
  2. A single table for all object types, with columns for each attribute, most of which aren't used for any given object type
  3. A child table, one row for each attribute

Before moving on to a good general solution., let's discuss these:

1. Different tables for each object type

This is a non-starter. The problems are:

  • high maintenance cost: you must create a new table every time you add a new object type to your app
  • painful queries: you must join to every table, either horizontally - every table joined into one enormously long row, or vertically in a series unioned joins, leading to a sparse array (see option 2)

2. A single table for all object types

Although you're dealing with a sparse array, if most object types use most of the attributes (ie it's not that sparse), this is a good option. However, if the number of different attributes across your domain is high, and/or most attributes aren't used by all types, you have to add columns when introducing a new type, which although better than adding tables, still requires a schema change for a new type = high maintenance

3. A child table

This is the classic approach, but the worse to work with, because you either have to run a separate query to collect all the attributes for each object (slow, high maintenance), or write separate queries for each object type, joining to the child table once for each attribute to flatten out the many rows into one row for each object, effectively resulting in option 1, but with an even higher maintenance cost writing the queries

None of those are great options. What you want is:

  • One row per object
  • Simple queries
  • Simple schema
  • Low maintenance

A document database, such as Elasticsearch gives you all of this out of the box, but you can achieve the same effect with a relational database by relaxing "strictness" and saving the whole object as json in a single column:

create table object (
  id int, -- typically auto incrementing
  -- FK to parent - see below
  json text -- store object as json
);

BTW, postgres would be a good choice, because it has native support for json via the json datatype.

I have used this several times in my career, always successfully. I added a column for the object class type (in a java context):

create table object (
  id int,
  -- FK to parent - see below
  class_name text,
  json text
);

and used a json library to deserialize the json using the specified class into an object of that class. Whatever language you're using will have a way of achieving this idea.

As for the hierarchy, a relational database does this well. From the canvas:

create table canvas (
  id int,
  -- various attributes
);

If objects are not reused:

create table object (
  id int,
  canvas_id int not null references canvas,
  class_name text,
  json text,
  layer int not null
);   

If objects are reused:

If objects are not reused:

create table object (
  id int,
  class_name text,
  json text
);

create table canvas_object (
  canvas_id int not null references canvas,
  object_id int not null references object,
  layer int not null
);
Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • 1
    speaking of postgres, here is a neat little package/script that makes use of a lot of the concepts you mentioned: https://github.com/solidsnack/pg-sql-variants – David542 Dec 05 '21 at 04:45
  • 1
    @David542 You could use class hierarchy, but I've always tied myself up in knots when attempting it. IMHO the dummer something is, the easier it is to work with, and the simpler, ie better, the end solution is. If it's easy to understand, it's easy to use well and maintain. – Bohemian Dec 05 '21 at 05:38
  • you mention at the start `A "strict" relational database doesn't suit this task well`, if you were tasked with storing this data, would you use a relational or non-relational DB? And, if the latter, would you want to include that as a section in your answer as well? – David542 Dec 05 '21 at 18:17
  • 1
    I would use relational because of the extra bells & whistles (such as transactional boundaries, referential integrity, views, etc), but in a "document" style - meaning it would violate [first normal form](https://en.wikipedia.org/wiki/First_normal_form) due to a single column storing an entire entity. I have done it using postgres, Lucene (actually Solr) and Elasticsearch - all work OK. Note that RDBMSs are [ACID](https://en.wikipedia.org/wiki/ACID), but document dbs are not (they are [eventually consistent](https://en.wikipedia.org/wiki/Eventual_consistency), which may not suit your purpose. – Bohemian Dec 05 '21 at 22:46