2

I come here to present a problem I've been banging my head on for the last couple of days while developing a design for a database.

A brief introduction:
The database is intended to store information about motortypes (not about actual motors, but more about their design, parameters and such), their components/attributes and relationships between certain motortypes (each motortype is related to a so-called base motortype). So we have several motortypes of which each has many attributes (I call them detailtypes). Each of the detailtypes in turn can be present in one or more motortypes. Each motortype:detailtype combination should have (exactly) one value.
I need to be able to compare the values of a specific detailtype for two different motortypes and store information based on this comparison. (The storage of this information is the database's central purpose.)


Now the first and simplest approach would be to make a motortype table and make it have a column per detailtype so I could store exactly one value per combination:

Approach #1:

CREATE TABLE Motortype(
id INTEGER PRIMARY KEY,
description CHAR (50),
detailtype1 CHAR (50),
detailtype2 INTEGER,
detailtype3 YES/NO,
detailtype4 DOUBLE );

There will be about one hundred (this number is likely to slowly grow in the future) such detailtypes to be entered into the DB. Adding more corresponds to adding more columns (manageable), changing them would result in renaming columns (bad idea?). The problem occurs when I want to compare the motortypes with the IDs 2 and 5 on their detailtype2 values and save information about this comparison. I don't know how one would manage that. (Also the attributes can't have the NOT NULL constraint, as some data might simply be unknown at the time the entry is created...)
That's why I went on to approach 2, which seems to implement EAV in some way.


In order allow storing the comparison informations I moved the detailtypes from being an attribute to mototype to their own entity, while leaving those that need no comparison as attributes:

Approach #2:

CREATE TABLE Motortype(
 id INTEGER PRIMARY KEY,
 description CHAR (50),
 notComparableAttribute1 CHAR (50),
 notComparableAttribute2 INTEGER
);

CREATE TABLE Detailtype(
 id INTEGER PRIMARY KEY,
 description CHAR (50),
 datatype CHAR (3)
);

CREATE TABLE MotorDetail(
 id INTEGER PRIMARY KEY,
 motortype_id INTEGER CONSTRAINT fk_mt_id REFERENCES Motortype(id) NOT NULL,
 detailtype_id INTEGER CONSTRAINT fk_dt_id REFERENCES Detailtype(id) NOT NULL,
 value CHAR (50) NOT NULL
);

This configuration then allows me to refer to any specific pair of values in MotorDetail and save additional information for this comparison. (Note that there are more dimensions to this comparison - it does not have a 1:1 relation to MotorDetail. Rather it's a 1:n MotorDetail:ComparisonData relation. Otherwise I could just directly store all the comparison information in MotorDetail.)
However, the fact that I now have a value attribute which needs to hold several different data types (strings, integers, floats, booleans) means that the responsibility for input validation is moved from the database to me (I need to make sure the users can only enter the data in a way that makes sense for the respective Detailtype). Which is completely fine - it's just a little more programming work, but I had the feeling it could be solved in a more elegant way.


So after some research on polymorphic associations I came up with the next approach (Motortype and Detailtype are as in #2):

Approach #3:

CREATE TABLE MotorDetail(
 id INTEGER PRIMARY KEY,
 motortype_id INTEGER CONSTRAINT fk_mt_id REFERENCES Motortype(id) NOT NULL,
 detailtype_id INTEGER CONSTRAINT fk_dt_id REFERENCES Detailtype(id) NOT NULL,
 valueTxt CHAR (50),
 valueInt INTEGER,
 valueDbl DOUBLE,
 valueBoo YES/NO
);

The value attributes can't have NOT NULL constraints, as 75% of the values will be null (as per requirements there can only be one value for each MotorDetail). Querying for values seems ok as long as I only need those of a specific data type, otherwise it gets more complicated (e.g. to display them on a form). Or maybe I should say tedious instead of complicated.


So in a last attempt at the problem I present my final approach (again, Motortype and Detailtype as in #2):

Approach #4:

CREATE TABLE MotorDetail(
 id INTEGER PRIMARY KEY,
 motortype_id INTEGER CONSTRAINT fk_mt_id REFERENCES Motortype(id) NOT NULL,
 detailtype_id INTEGER CONSTRAINT fk_dt_id REFERENCES Detailtype(id) NOT NULL
);

CREATE TABLE MotorDetailValueTxt(
 id INTEGER PRIMARY KEY,
 valueTxt CHAR (50)
);

CREATE TABLE MotorDetailValueInt(
 id INTEGER PRIMARY KEY,
 valueTxt INTEGER
);

... (same for MotorDetailValueDbl and MotorDetailValueBoo)

I'm not sure on this one. Not sure how to represent the idea that each id in the value tables is tied to exactly one MotorDetail(id) and needs to be unique across all value tables. Not even sure I understood the underlying concept fully. But what I'm sure of is that doing it this way the DB can't enforce referential integrity. That's why I'm not going to use it. In a test DB I was able to produce a query with UNION to get all the different values, but I don't think I want to do any housekeeping down at table entry level (to somehow ensure no dead entries remain).


The question:

Given the details outlined so far, is there a possible solution that does not involve some coding wizardry of some kind (VBA or otherwise)? An elegant solution that somehow allows handling this all at the database level (tables+relations as opposed to reports, forms, etc...)? As everything I've come up so far seems rather clunky.


Note: I am quite the newcomer to database design and databases in general. I've had 3 days of training in MS Access, but that was mostly tailored towards learning the GUI and building some really basic DBs. About anything else I know (or assume to know) about DB development was learned by reading blogs - and most importantly: Questions and answers here on SO.
It seems safe to assume that I did not (fully) understand one or more of the concepts I discussed and in consequence misrepresented some of it. If so, or if anything else is missing, please point it out to me in the comments or make an edit to improve my question (and to allow me to read up on it). :)

Edit: An edit was made, and a large one a that. I tried to add information, clarify what was there and still keep the original character and intent of the question. Let's hope I succeeded at least partially. :)

Inarion
  • 578
  • 3
  • 14
  • @philipxy Thanks for your comment. Reading about subtyping/polymorphism was what lead to this question. I'm currently reading up on "FKs to multiple/two/many tables". I certainly do not intend to use EAV, maybe my table with example values is written in a confusing way? I'll try to clarify that in an edit. I do have several books on the topic of database design, however they don't mention more complex relations (they are likely too much oriented towards beginners). I did actually start with a really simple design, it just became rather complex quite quickly due to business requirements. – Inarion Aug 23 '17 at 14:10
  • @philipxy Regarding "text in question, not in links/images": Isn't that how I did it? The only text qualifiying as being "in links/images" is the text in the images' "captions" (due to me being not yet allowed to post inline-images) as well as the text in the screenshots. I'm not sure what you mean here. And re "Format tables using code format": I did. The only table in my question is formatted as code. Or are you saying I should present the relations in the linked images as ASCII-art? – Inarion Aug 23 '17 at 14:18
  • I think I get now what you say with regards to EAV. :( But I'm not sure that doing it differently is actually feasible. As mentioned, there are about 100 details describing each motor. That would result in either creating one huge table (100 columns) or about 100 tables with a huge linking table between motor table and the detail tables. But in any case I think I need to go back and reevaluate my model. – Inarion Aug 23 '17 at 14:39
  • 1
    Your images are pictures of DDL. [Give the DDL.](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-on-so-when-asking-a-question/285557#285557) Which you should anyway--[mcve]. I'm not sugesting EAV; it's just that one column gives the type of another in both EAV--where the data is essential to the representation--and in some subtyping idioms--as a variant tag that is redundant & used for constraints. Re an initial design forget "huge", find straightforward relation(ship)s/associations. Rearrange later. Remember--DBMSs already have metadata & DDL commands. – philipxy Aug 23 '17 at 17:34
  • @philipxy Thank you for your input so far, I really appreciate it. After taking into account what you asked/told me I realized that what might look like an appropriate solution for the tables as presented will most definitely clash with other requirements I have to fulfil. (I did not include those since a) I thought they would not matter, and b) it would have made everything much more complicated, and c) in presenting the whole scheme I would most certainly breach confidentiality.) – Inarion Aug 24 '17 at 12:01
  • So after all my question seems not to be a good fit in its current state, without too much room to have it actually representing my core problem. What to do with it? Should I vote to delete it? Or keep the post for a (hopefully more clear-cut) question I might have in the future? – Inarion Aug 24 '17 at 12:04
  • 1) I realized as much. The problem is that in order to properly clarify the question, I'd need to include way more information (there are at least 5 more entities that are somewhat relevant). This would bloat the question's size even more and, more importantly: I don't think I can simply reveal all that information just like that. I would at least need to check with my supervisor lest I violate my contract. (OTOH if it was a hobby project, I'd be happy to include all there is to the topic.) So in conclusion I feel unable to raise my question's quality enough to make it clear and answerable.... – Inarion Aug 24 '17 at 13:30
  • ... And thus I wondered what to do with it. As I don't want to steal more of your time - I think you've already helped me a great deal on this matter. Now it's time for me to research some more and come back if (/when) more concrete questions arise. :) As for your part 2) I mostly end up on SO when googling something technical, so I already spent quite some time reading Q&A on whatever questions I could think of. Now you've given me some more helpful keywords to read up on, so thanks! :) – Inarion Aug 24 '17 at 13:32
  • 1
    I suspect your issues are more simple & basic than in your question. Again, its design is a complex version of some simpler one not yet given. (Surely you can give a scrubbed example parallel to the actual?) Then the question us, given the (combined) issues what are reasonable tradeoffs & idioms. Re my last comment, read the first page hits from googling 'stackoverflow' with '(database OR sql) subtype', 'many null columns or many tables', 'foreign key to multiple tables' & 'EAV'. (Re EAV https://stackoverflow.com/a/23950836/3404097 & https://stackoverflow.com/a/33557166/3404097). Good luck. – philipxy Aug 25 '17 at 03:50
  • I started designing my DB from scratch and at a different angle than before but ended up with a similar design... See my edited question for hopefully more clarity. I think your notion that I was thinking more complex than necessary might stem from the fact that I withheld some information I thought unnecessary, but which actually (from my point of view) required the more complex solution with separate Detailtype and MotorDetail tables. – Inarion Sep 01 '17 at 15:44
  • I may not have time to read your question for a while, so--please (as with any edit) if it invalidates my answer (any reasonable answer) post another question & roll this one back to the version I addressed & edit a link to the new at the top of the old explaining that you extended it there.(We can click 'edited' then click on a version's 'rollback' to return to that version.) I'll inline the image. I felt the question was clear enough for a helpful answer. (Also, we are asked to keep a question up to date without reference to earlier versions, and without social greetings etc.) – philipxy Sep 01 '17 at 19:33

1 Answers1

1

Your issues seem more simple & basic than a comparison of your designs. Which seem more complex than necessary or desired for the given fragment of your application.

A business relation(ship)/association is represented by a table. We can characterize a relation(ship)/association/table by a predicate--a column-parameterized statement template. A table holds the rows that produce a true proposition--statement.

You haven't given any reason for a design any more complicated than:

-- motor motorID has (non-details) ...
Motor(motorID, ...)
-- motor motorID has NumOfCylinders cylinders
MotorNumOfCylinders(motorId, NumOfCylinders)
-- motor motorID has cylinder material CylinderMaterial
MotorCylinderMaterial(motorId, CylinderMaterial)
...

Do you want just one motor table?

--      motor motorID has (non-details) ...
--  AND motor motorID has NumOfCylinders cylinders
--  AND (   NumOfCylinders > 0 and motor motorID has cylinder material CylinderMaterial ...
--      OR  NumOfCylinders = 0 and CylinderMaterial = NULL ...
--      )
...
-- PK (motorID)
Motor(motorId, ..., NumOfCylinders, CylinderMaterial, ...)

You haven't explained what can change over time. Do you want a changing association of a string with a given detail? For optional strings:

-- detail "NumOfCylinders" is written NumOfCylinders
NumOfCylindersString(NumOfCylinders)
-- detail "CylinderMaterial" is written CylinderMaterial
CylinderMaterialString(CylinderMaterial)
...

Do you want just one such table? For obligatory strings:

--     detail "NumOfCylinders" is written NumOfCylinders
-- AND detail "CylinderMaterial" is written CylinderMaterial
...
DetailStrings(NumOfCylinders, CylinderMaterial, ...)

Do you explicitly want different kinds/types of motors so that you have static constraints when dealing with a particular kind/type?

-- motor motorID ... and has NumOfCylinders cylinders and weighs Weight kg and ...
Motor(motorId, ..., NumOfCylinders, Weight, ...)
-- piston motor motorID has cylinder material CylinderMaterial ...
PistonMotor(motorId, CylinderMaterial, ...)
--       electric motor motorId has ...
--   and it is isAC that it takes AC current
--   and it is isDC that it takes DC current
...
ElectricMotor(motorId, isAC, isDC, ...)
...

Do you want to constrain a motor to be of exactly one kind/type at the cost of redundancy & calculation?

-- motor motorId is of type motorType and ...
Motor(motorId, motorType, ...)
-- PistonMotor enforce (motorId, 'piston') in (select motorId, motorType from Motor)
-- ElectricMotor enforce (motorId, 'electric') in (select motorId, motorType from Motor)
...

Do you want to declaratively constrain instead per a different cost tradeoff (via a stored or generated column value)?

-- motor motorID has type motorType and motorType = 'piston' and cylinder material CylinderMaterial ...
-- FK (motorID, motorType) references Motor (motorID, motorType)
-- check (motorType = 'piston')
PistonMotor(motorId, motorType, CylinderMaterial, ...)
-- motor motorID has type motorType and motorType = 'electric' and ...
-- FK (motorID, motorType) references Motor (motorID, motorType)
-- check (motorType = 'electric')
ElectricMotor(motorId, motorType, isAC, isDC, ...)
...

More?

-- Motor check (NOT (motorType = 'electric' AND NumOfCylinders <> 0))

Can a detail have different values over time? Do you want to have different details over time? You can use DDL to implement changes. Do you want to know what details or detail values there are currently? Query metadata. Or there may be some non-system table or its pivot that is 1:1 with rows of a system table or its pivot.

Do you somehow think/assume/suspect/fear that implementing user-directed state changes is somehow better to be done via DML than DDL, at the expense of complexity & processing? Has DDL-update implementation been shown to be too slow? You can encode those database states partly or fully via EAV with its concomitant pros & cons.

An EAV 'database' [sic] is literally mathematically straightforwardly an undocumented description in triples of a database and its metadata, with no functionality to

To paraphrase Greenspun, any sufficiently complex EAV project contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of a DBMS.

(Observe that one column gives the type of another in both EAV--where the data is essential to the representation--and in some subtyping idioms--as a variant tag that is redundant & used for constraints. Is there a way to represent relationship between a tuple and to a table in E-R diagram?)

Given a straightforward design and the kinds of queries & updates you want, it can be rearranged to another design. Other designs' base tables are views/queries of basic design tables, and vice versa.

I have only given constraints when a design was modifed specifically to promote simpler expression of constraints. The base predicates and the business situations that can arise per business rules determine the database states that can arise and hence the database constraints. We don't need to know constraints to query.

philipxy
  • 14,867
  • 6
  • 39
  • 83
  • Thank you for investing so much time into my problem. I've been reading in one of my books on data modeling and it provided some very helpful insights on how to reliably draft a model from "user scenarios". (The other books weren't taking much about it.) I suspect in my last attempt I might have skipped some important steps. I'll now have another go a the design and will report back soon! I will provide details for your inquiries if still necessary or otherwise revise my question - depending on the situation. – Inarion Aug 25 '17 at 08:52
  • Sorry for not responding to the 1:n thingy. What I meant was to avoid situations as in approach #2, where MotorDetail_ID was a FK to three different tables and needed to be unique across all three (so I'd basically need a common index for the three tables, which Access does not provide). I'm still not used to talking in this DB-specific language, so I apologize for not being as clear as I'd like. How to proceed with my question? Should I try to make it somewhat clearer by responding to you requests for clarification? Should I mark your answer as accepted? – Inarion Aug 25 '17 at 12:53
  • 1
    Thanks. (Still puzzled, eg ERD 2 says 3 FKs *to* MotorDetail.) Re edit and/or ask anew, I suggest clarified design descriptions but otherwise as is, seeking other answers, or as practice. You mention constraint problems, but don't give the constraints. Or explain how to set or interpret tables given some situation where some motors have some details with some values for some types or clarify the constraints! Per my answer, constraints do not do that; predicates do. Then predicates & possible situations determine the constraints. Re accepting: https://meta.stackexchange.com/a/5235/266284 – philipxy Aug 25 '17 at 15:34