0

The project I'm working on is an application that lets you design data entry forms, and automagically generates a schema in an underlying PostgreSQL database to persist them as well as the browsing and editing UI.

The use case I've encountered this with is a store back-office database, but the app itself intends to be somewhat universal. The administrator creates the following entry forms with the given fields:

  • Customers
    • name (text box)
  • Items
    • name (text box)
    • stock (number field)
  • Order
    • customer (combo box selecting a customer)
    • order lines (a grid showing order lines)
  • OrderLine
    • item (combo box selecting an item)
    • count (number field)

When all this is done, the resulting database schema will be equivalent to this:

create table Customers(id serial primary key,
                       name varchar);
create table Items(id serial primary key,
                   name varchar,
                   stock integer);
create table Orders(id serial primary key);
create table OrderLines(id serial primary key,
                       count integer);
create table Links(id serial primary key,
                   fk1 integer references Customers.id,
                   fk2 integer references Items.id,
                   fk3 integer references Orders.id,
                   fk4 integer references OrderLines.id);

Links being a special table that stores all the relationships between entities; every row has (usually) two of the foreign keys set to a value, and the rest set to NULL. Whenever a new entry form is added to the application instance, a new foreign key referencing the table for this form is added to Links.

So, suppose our shop stocks some widgets, gizmos, and thingeys. A customer named Adam orders two widgets and three gizmos, and Betty orders four gizmos and five thingeys. The database will contain the following data:

Customers

/----+-------\
| ID | NAME  |
| 1  | Adam  |
| 2  | Betty |
\----+-------/

Items

/----+---------+-------\
| ID | NAME    | STOCK |
| 1  | widget  | 123   |
| 2  | gizmo   | 456   |
| 3  | thingey | 789   |
\----+---------+-------/

Orders

/----\
| ID |
| 1  |
| 2  |
\----/

OrderLines

/----+-------\
| ID | COUNT |
| 1  | 2     |
| 2  | 3     |
| 3  | 4     |
| 4  | 5     |
\----+-------/

Links

/----+------+------+------+------\
| ID | FK1  | FK2  | FK3  | FK4  |
| 1  | 1    | NULL | 1    | NULL |
| 2  | 2    | NULL | 2    | NULL |
| 3  | NULL | NULL | 1    | 1    |
| 4  | NULL | NULL | 1    | 2    |
| 5  | NULL | NULL | 2    | 3    |
| 6  | NULL | NULL | 2    | 4    |
| 7  | NULL | 1    | NULL | 1    |
| 8  | NULL | 2    | NULL | 2    |
| 9  | NULL | 2    | NULL | 3    |
| 10 | NULL | 3    | NULL | 4    |
\----+------+------+------+------/

(The tables also contain a bunch of timestamps for auditing and soft deletion but I don't think they're relevant here, they just make writing the SQL by the administrator that much messier. The management app is also used to implement a bunch of different use cases, but they're generally primarily data entry, master-detail views, and either scalar fields or selection boxes.)

When I've had to write a join through this thing I'd grumbled about it to my coworker, who replied "well using separate tables for each relationship is one way to do it, this is another..." Leaving aside the obvious-to-me ugliness of the above and the practical issues, I also have a nagging feeling this has to be a violation of some normal form, but it's been a while since college and I'm struggling to figure out which of the criteria apply here.

Is there something stronger "well that's just your opinion" I can use when critiquing this design?

millimoose
  • 39,073
  • 9
  • 82
  • 134
  • There's no bad without better. Tell us how this & other stuff goes together to model something, so we can show that the something can be modeled better. Probably your description of the something will be just such a better design. This fragment is peculiar but doesn't tell us much. Tables in a relational database represent business relationships on values (some of which id entities). This presumably helps encode a straightforward design (tables) by a complex one. Eg as in [EAV](https://stackoverflow.com/a/23950836/3404097). (This is going to be so wrong that it won't matter how it's wrong. – philipxy Mar 15 '18 at 17:47
  • @philipxy - The FK attributes are foreign keys into other tables. None of the other tables have foreign keys between them. The app lets you basically design data entry forms, then generates a SQL schema to persist them and grid views to read them; my example is nonspecific because the design itself is. `links` contains all relationships between all tables thus created. If the form for an Apple contains a combo-box that lets you select an Orange, when you save this Apple, the app inserts a row where `fk1` is set to that Apple’s primary key, `fk2` to the Orange’s, all the other FKs are `null`. – millimoose Mar 15 '18 at 18:14
  • @philipxy When a new input form is added to the app, and a new table is created in the schema, a new FK column is added to `links`. One reason I can imagine for this design is that you don’t have to generate a FK column on a contained entity (or an association table) when a new relationship should actually be created in the schema - e.g. when you add a combobox like I described above - `links` can connect every table to every table. However I’m not sure why would an advantage if the app already changes the schema when you add controls to forms, one to warrant the schema losing explicitneds. – millimoose Mar 15 '18 at 18:26
  • Please edit clarifications seamlessly into your post. The design is not "nonspecific"--it does a certain something. Sounds like EAV though--programming your own semi-buggy-sorta-DBMS instead of giving a user limited access the DBMS functionality you already have. PS Relational databases do not use "links" (including FKs) to "connect" tables. Tables represent business relationships/associations & you get other relationships via queries the "connect" tables per how logical operators "connect" relationships. FKs are constraints--they prevent invalid states. – philipxy Mar 15 '18 at 18:28
  • Having read up on EAV, this looks more like its mirror universe counterpart. This thing stores *most* of the regular attributes of entities in normal-ish tables, just all references between tables go into this one. (The app does store the structure of the GUI as EAV but that’s a separate “layer” entirely.) – millimoose Mar 15 '18 at 18:31
  • Your comments are just too vague re user spec & implementation. Characterize user state at one moment relationally, then the mapping from that to the encoding tables including the one in your post. The question is then begged, why don't you implement the parts of the user's state that is some constrained tables *as straightforward DBMS state*. Yes, typically EAV is used for the part of the user state that is constrained gui tables that change over time. (But EAV should only be used if DDL is inadequate.) Maybe you mean the non-EAV part is encoded via your post table--see start of this comment. – philipxy Mar 15 '18 at 18:46
  • PS There is another anti-pattern "two/many/multiple FKs to two/many/multiple tables" where either one of several FK columns has a value & the others are null or a "FK" [sic] is sought to one of several tables. This is a wrong way to model database/SQL subtyping/inheritance/polymorphism. Seems like this could be a variant of that where there are 2 valid FKs instead of 1. [Eg.](https://stackoverflow.com/a/24421797/3404097) See my previous comments. – philipxy Mar 15 '18 at 19:38
  • @philipxy - this does seem closely related to that actually - `links` would be a polymorphic entity from this angle, except over (parts of) completely unrelated types, and a bunch of types that don’t actually even exist on the logical schema. – millimoose Mar 15 '18 at 19:40
  • @philipxy - I've given a simplified concrete example of the schema I'm actually working on, hope that helps. – millimoose Mar 15 '18 at 23:26
  • What is "is equivalent to" & why not "is"? Why doesn't OrderLines have an order id & item id? How is Link used? It looks like instead of having an id appear where it belongs in a FK column of a row with a referencing table's PK id where it belongs, the 2 ids are in a row in Links--for various PK-FK table pairs. Why?? Why not where it belongs plus a FK constraint? Are the 4 not hardwired with DDL but encoded by EAV with Links encoding FK meta-data? "Universal"--EAV? Please try to do what I said, give DDL & data for how the user sees their state then for how the implementation encodes it. [mcve] – philipxy Mar 16 '18 at 05:41
  • PS Sure sounds like your app is a hand-coded DBMS: "an application that lets you design data entry forms, and automagically generates the database schema to persist them as well as the browsing and editing UI." – philipxy Mar 16 '18 at 05:48
  • @philipxy "Is equivalent to" just means concessions to making the example minimal - the table names are in english, I removed all the columns not relevant to what I'm asking about. I'm not making up a more relational schema that has nothing to do with what I'm looking at, this is the sort of schema the application generates - "simple" attributes of entities that aren't references to other entities are encoded as columns of tables that correspond to those entities; relationships between any pair of entities are encoded as records in the same "links" table. – millimoose Mar 16 '18 at 12:42
  • @philipxy - I'm honestly at a loss as to how more complete while yet minimal I can make this. The end user just sees an editable data grid of customers, another of orders etc. The administrator sees a form designer, and access the generated schema directly when needed. As for your "why?"s, I can't give you an answer other than: because that's the choice someone made years back and is defending it to me now as equally valid to what you suggest that seems just as obvious to me as being the right answer. What I'm asking is not what the right answer is, but for arguments why it's the right one. – millimoose Mar 16 '18 at 12:46
  • @philipxy - one thing that might not have been clear so far - the app does not actually implement the RDBMS part, we're manipulating a Postgres schema as specified above. I said the given DDL is "equivalent" because the app is live-editable, so the tables end up getting altered; not because we're working with a substantially different encoding at the persistence layer. – millimoose Mar 16 '18 at 14:34

0 Answers0