0

I'm starting work on a fairly complex project that will eventually be used to run multiple departments (HR, finance, etc.) as well as hold client data, etc. I've been handed a database design doc that explain how the previous individuals (who are no longer around) wanted to design the database. It's a design that I've never come across before. The simplified idea is that there is on table that holds actual data and a second table that combines that data into logical units.

For example, here would be the first table with data:

KEY, DATA
0,   Name
1,   First Name
2,   Last Name
3,   Position
4,   Title
5,   Reports To
6,   John
7,   Smith
8,   Developer
9,   CTO

And here would be the second table that combines the data:

KEY, GROUP_KEY, DATA_KEY, CHILD_KEY
0,   NULL,      0,        NULL
1,   0,         1,        2
2,   0,         2,        NULL
3,   NULL       3,        NULL
4,   3,         4,        5
5,   3,         5,        NULL
6,   0,         6,        7
7,   0,         7,        NULL
8,   3,         8,        9
9,   3,         9,        NULL

Basically, the first line in the second table (key 0) defines a new grouping that is called Name. The next two lines define the "columns" that belong to that group. The fourth line (key 3) defines another grouping that is called Position and again the following two lines define the "columns" that belong to the group. The last four lines "assign" the values John and Smith to the group Name and the values Developer and CTO to the group Position.

This is greatly simplified but I'm hoping it gives a basic idea of the proposed database structure. One table to hold all possible values that can exist in the database and a second table that combines these values into any and all possible combinations.

My new manager is not too keen on this design and personally I've never come across it. Since this will be a C# project using either the Entity framework or NHibernate for interacting with the database this kind of database design seems like it would be a challenge to implement in either of those frameworks. Is there a name for this kind of design so I can research it further? Are there any major pros or cons for this design? The documentation mentions that this is done for better performance and for "hyper" normalization.

philipxy
  • 14,867
  • 6
  • 39
  • 83
Harry Muscle
  • 2,247
  • 4
  • 38
  • 62
  • This is indeed a very odd design. It looks like someone tried to reinvent the Wheel/Views. It could also be lookup tables. However it might also be a nesseary design concession. We do not know anything about the hardware and hte specifics of the hardware just *might* make this nessesary. Not that I can think of any example reason. But overall I think views are the droid they should have been looking for. – Christopher Nov 19 '18 at 19:47
  • 1
    Google re EAV & its problems. [EAV is for when DDL is inadequate.](https://stackoverflow.com/a/32285603/3404097) [EAV implements a DBMS](https://stackoverflow.com/a/23950836/3404097) This seems like an EAV encoding of a table representing a relation(ship)/association that has certain tree-like structure of indefinite depth. They give tables that represent tables that represent a data structure that represents the applicaton instead of just giving tables that represent the application. – philipxy Nov 19 '18 at 20:10
  • PS "Basically" not introducing or summarizing an unequivocally clear presentation just means "unclearly". And putting something in quotes when not defining or quoting doesn't communicate the idiosyncratic non-standard probably fuzzy meaning you are using for it. – philipxy Nov 19 '18 at 20:14

1 Answers1

1

This looks like the Inner-Platform antipattern. You've got an RDBMS and, instead of normalizing your data into relations, you decide to use it like a flat file and dump each datum into a separate row, expecting to assemble the rows into relations at runtime by joining on magic key columns.

I have never seen this work well; performance is abysmal, there are no constraints so data is missing or duplicated all over the place, there are no key relations so no way to enforce validity, normal database operations like aggregates and grouping must be done manually with procedural logic. You're basically using a database to implement a database.

Your architects probably thought they were providing “expandability”. Look! You can add any data of any kind to the end of the database! This is rarely useful; it makes finding data and enforcing validity nearly impossible. If you really really need to add any data type at runtime, every SQL database since the 1970s has allowed dynamic SQL and altering the schema at runtime.

-1. Would not purchase again.

Dour High Arch
  • 21,513
  • 29
  • 75
  • 90
  • This pattern is used reliably and widely in analytics. In fact, what he is describing is a partial (the metadata is clear, but the data persistence is not) name-value construct in which an unknown number of columns can be joined at run time to construct a virtual row. Unfortunately, this construct, when used in OLTP will be a bottleneck for performance. Depending on the DB used, it is possible to build materialized views for this data, or a memcached version of a virtual table for usage. – T Gray Nov 27 '18 at 17:08
  • Oh I've used this pattern successfully myself for circular buffers, logs, caches, and others. But that's not what he's using it for; he's trying to store **relations** like “John Smith is a Developer” in a non-relational list. Which is then stored in a relational database, only not storing, you know, the actual relations. – Dour High Arch Nov 29 '18 at 02:28
  • What it comes down to is "are you trying to hammer a nail?" The crescent-hammer approach rarely yields good results. – T Gray Dec 11 '18 at 23:18