0

I'll try and simplify as much as possible, however, if you need more information please let me know.

I'm using Rails 4 and PostgreSQL

edit:

  • using PSQL 9.3
  • the dataset won't change often and for this particular table will probably only have 15 columns

I have a design where there are "core" components that have default attribute values like:

  • material = wood
  • color = blue
  • price = $1.52
  • dimensions = 3x2x5

These "core" components and their default attribute values are managed by an admin who can make adjustments through an admin interface as needed.

A user can create a new component_group and it will pre-populate with available components. The components in the new group all use the default attribute values of their "core" component.

A user can then modify the attribute values of any of the components that the group contains.

What I currently do is: duplicate each "core" component to create a new unique record with the identical attribute values of the "core".

My concern is, that this app will potentially create a HUGE amount of records; many of those records may not have their default attribute values changed. While I don't know definitively, this seems like it's going to eventually be a performance problem (especially when you consider that in the real world scenario, components will have their own relations which may need to be duplicated as well).

My initial thought was to implement some kind of system where a new component record is only created if it's attribute values are changed, otherwise the component_group referrers to the "core" component.

So my questions are:

  1. Is my current approach even remotely correct?
  2. Are my performance concerns valid, or will it be insignificant to the DB?
  3. Would this type of functionality be better suited to a NoSQL DB like CouchDB?
  4. Is there a specific name for this type of functionality? I've looked at Class-Table Inheritance / Multi-Table Inheritance but I don't think that's what I'm looking for.
Community
  • 1
  • 1
Eric Norcross
  • 4,177
  • 4
  • 28
  • 53
  • Are you dealing with a small, stable set of attributes? As opposed to a huge, dynamic set. Provide your table definitions (`\d components` in psql) and (as *always*) your version of Postgres. It's hard to optimize a solution that's barely defined. – Erwin Brandstetter Jun 24 '15 at 02:08

1 Answers1

0

You can use (mostly) identical table definitions and NULL values in the child table to default to the respective column value of the parent row. Code example:

CREATE TABLE comp_template (  -- parent table 
  comp_template_id serial PRIMARY KEY
, material_id      int REFERENCES material
, color            enum
, ...           -- attributes may or may not be defined NOT NULL
);

CREATE TABLE comp_group (  -- container
  comp_group_id serial PRIMARY KEY
, comp_group    text NOT NULL
)

CREATE TABLE comp (  -- child table 
  comp_id          serial PRIMARY KEY
, comp_group_id    int NOT NULL REFERENCES comp_group ON UPDATE CASCADE
                                                      ON DELETE CASCADE
, comp_template_id int NOT NULL REFERENCES comp_template ON UPDATE CASCADE
, material_id      int REFERENCES material
, color            enum
, ...           -- like comp_template, but all attributes can be NULL

A view returning effective values:

CREATE VIEW comp_effective AS
SELECT c.comp_id, c.comp_template_id
     , COALESCE(c.material_id, t.material_id) AS material_id
     , COALESCE(c.color, t.color) AS color
     , ...
FROM   comp c
JOIN   comp_template t USING (comp_template_id);

NULL storage is very cheap:

This is assuming that you have a small, mostly static set of possible attributes. The solution is efficient up to a couple of hundred distinct attributes (columns) where you don't add another attribute every day. Else look to unstructured data types like hstore or jsonb

You could use inheritance between comp_template and comp, would make sense. But consider limitations of the Postgres implementation first.

Related answer with more details:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Could the `comp` table be self-referential thereby eliminating the need for the `comp_template` table? – Eric Norcross Jun 24 '15 at 03:04
  • it will be mostly static and attributes will be added infrequently if at all. I'm using PSQL 9.3. Also, I wanted to note that I'm not really familiar with PSQL beyond what's necessary to know for RoR – Eric Norcross Jun 24 '15 at 03:16
  • @greetification: Yes, could be self-referential. This way you could even reference recursively (which would make the view more complex). But since you mentioned that component templates are entered by admins, permission management will be easier with a separate table. To sum it up: fewer options, but simpler and faster with a dedicated template table. – Erwin Brandstetter Jun 24 '15 at 03:35
  • so, apart from recursive referencing, is there any advantage to making it self referential? – Eric Norcross Jun 24 '15 at 04:08
  • Also, just to clarify, is the `ON * CASCADE` bit the important part for making this work? – Eric Norcross Jun 24 '15 at 04:09
  • Advantage of single table: One less table (you would add a tag to tell both types apart), more flexible, like: you can "upgrade" a plain component to a template by just changing its type. But also more pitfalls. If in doubt, go with separate tables. `ON UPDATE / DELETE CASCADE` is not essential. Explanation in this related answer: http://stackoverflow.com/questions/9789736/how-to-implement-a-many-to-many-relationship-in-postgresql/9790225#9790225 – Erwin Brandstetter Jun 24 '15 at 04:14