Data modelling. How to model hundreds of dependent fields, 1:n or as one table?

Question

i have to store leasing cars that come with graduated progressive running costs, for example:

main data:

Id:    123  
Model: Ford Transit
yom:   2013

graduated variable running costs:

kilometers/year utilities  tires service misc
        10,000        80     400      50   30
        12,000       100     400      55   35
        14,000       120     400      60   40
          ...
       100,000       500   1,500     150  100

So with a scale of 10,000 to 100,000 in steps of 2,000 and 4 different cost types, this results in 50 x 4 = 200 data fields.

Is it better to model this as a 1:n relation or is it ok to have more than 200 columns in my car table (MySql)? What pros and cons do exist?

In case of 1:n, would a trigger make sense to create the 200 fields in the variable costs table whenever a new car record is entered?

thanks in advance for any kind of hint

For each example record, are the four cost fields mandatory, are any of them optional ? — PerformanceDBA, May 30 '15 at 14:14

score 3 · Accepted Answer · edited May 23 '17 at 11:44

So with a scale of 10,000 to 100,000 in steps of 2,000 and 4 different cost types, this results in 50 x 4 = 200 data fields.

Yes. When that is Normalised, the 200 fields per record are transformed into 4 columns per Normalised row.

Is it better to model this as a 1:n relation or is it ok to have more than 200 columns in my car table (MySql)?

Well, if you are considering relations, it is a 1::n relation.

It is never acceptable to implement un-normalised records (eg. in your case, one of 200 fields) in a database.

What pros and cons do exist?

I can't give a tutorial here, but in general, Normalised rows have (a) Relational Integrity, (b) Relational power [you will notice that in the form of reduced joins], (c) Relational speed (d) ease of extension [addition/change to the database], and (e) ease of coding. None of which un-normalised records provide, they provide the opposite.

In case of 1:n, would a trigger make sense to create the 200 fields in the variable costs table whenever a new car record is entered?

In general, triggers are not required in Relational databases. They are only required when one needs to do weird things, and weird things are the result of not Normalising the data.

In your case, the 200 values for the new car being added are not known, so a trigger wouldn't work anyway.

No, do it properly, in an ACID Transaction, as and when the user adds a new car, and populates the four columns per kilometre step. You may well have 4 default values that auto-fill the fields on the screen, but that should not be saved to the database until the user hits the Add button..

Data Model

Here is the data model that you need.

Vehicle Maintenance Data Model
I have Normalised the data, without explanation of the steps or process, this is the end result.
- Eg. Ford Transit is not Atomic, I have made it Atomic by separating Manufacturer and Model.
Mandatory/Optional costs:
- If the four cost fields appear for every Maintenance (Kilometre step) row, you need the model on the right
- If any of the cost fields are optional, you need the model on the left. I have assumed that Service is mandatory, ie. it occurs for every kilometre entry.
I have given you Relational Keys, and thus Relational Integrity. If you do not use compound keys, you will lose that integrity. Eg:
- A Model (eg. Transit) does not exist independently, it exists only in the context of a Manufacturer (eg. Ford).
- A ModelYear (eg. 2013) does not exist independently, it exists only in the context of a Model (eg. Ford Transit).
- You may substitute the names codes, if you think they are too wide.
The Maintenance table needs a simple CHECK constraint to ensure that the Kilometre step is modulo 2000.

That is an IDEF1X model. IDEF1X is the Standard for modelling Relational Databases. Please be advised that every little tick; notch; and mark; the crows foot; the solid vs dashed lines; the square vs round corners; means something very specific and important. Refer to the IDEF1X Notation. If you do not understand the Notation, you will not be able to understand or work the model.

Record Id

Note that these days many people do not understand how to create Relational tables, with unique rows, as required by the Relational Model. They create files, typically with a Record ID each, and allow masses of duplicates. Please read this Answer, from the top to False Teachers. And try the code given, and linked, therein.

Warning

Zohar:
I would also suggest a lookup table for car models, so that each row in the cars table would have a model id instead of the model name.

That sort of advice comes from people who have read the books of false teachers, and who apply it, and advise it, with no actual knowledge or experience whatsoever. It keeps everyone stuck in pre-1970'2 ISAM Record Filing Systems, while thinking they have a "relational database".

They simply do not know the Relational Integrity, Power, and Speed, that they are missing. If someone asks a question, I will provide an answer. Not here, this answer is complete.

thanks for the elaborated answer, even if i didn´t intend to start a competition of DB scholarship ;) — Prefect73, Jun 01 '15 at 07:21
@Prefect73. You are most welcome. I will try to address the claims made in the other answers, for which no evidence has been supplied, even after I have challenged them, elsewhere. — PerformanceDBA, Jun 01 '15 at 13:51

Carl Prothman · Answer 2 · 2015-05-30T12:50:21.310

-1

You could go with a data model which has pre-populated look up tables (right side). Then apply that maintenance to the actual vehicle (left side) ERD

edited May 30 '15 at 12:50

answered May 29 '15 at 17:00

Carl Prothman

1,461
13
23

1

You have the right idea, but you ruin it with the IDs stamped on every file. – PerformanceDBA May 30 '15 at 09:48
Files? We are talking about tables in a relational database, right? The above is a logical ERD, which is all about the Entity relationships (PK and FK placement). Get those right and you will have a great data model. :) – Carl Prothman May 30 '15 at 12:34
1

*tables in a relational database, right?* No, they are files, with none of the capabilities of a relational table. CREATE TABLE does not imply the creature is a relational table. PRIMARY KEY does not imply a correct PK. *ERD etc* Yes! That's why I said *you have the right idea*. But you have not got any of it *right*. You have (a) no relational tables (b) duplicates in all tables (c) no Relational integrity (d) no Relational navigation (e) you break the **Access Path Independence** rule in the *Relational Model*. – PerformanceDBA May 30 '15 at 12:48
Relational tables by definition have FK. That is what makes a relational database relational. :) I will agree its a logical ERD, not a physical one. – Carl Prothman May 30 '15 at 12:54
1

No idea whose "definition" you are using. I am using Dr E F Codd's, as stated in the *Relational Model*. An FK does not "define" the creature as a relational table, it defines the creature as having an FK. The element that differentiates a Relational table from a pre-relational file, is that the rows are related by RELATIONAL KEY, whereas in the pre-1970 files, the records are related by RECORD ID. Therefore, the use of ID as "pk" is a regression to pre-1970 technology. I suggest you handle the unlimited duplicate rows that you allow, before trying to understand Relational Keys. – PerformanceDBA May 30 '15 at 13:01
Well, I've put forth my proposed data model via logical ERD. How would you model the problem? What would your ERD look like? Just curious... – Carl Prothman May 30 '15 at 13:39
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/79195/discussion-between-performancedba-and-carl-prothman). – PerformanceDBA May 30 '15 at 13:43
1

(a) It gets silly, when I comment, so that you can correct your files, instead of voting you down, and then you skip 15 steps and ask how I would model it (b) *How* Relationally. No duplicate rows. Full Relational Integrity. (c) *Look like* It would *look* similar to yours, with the errors corrected. But is would be vastly different, a database instead of a set of linked files. – PerformanceDBA May 30 '15 at 13:53
1

[**This code**](http://sqlfiddle.com/#!9/761aa/1) (a) proves the duplicates with surrogate ID fields in the file, (b) shows how to remove them, by using a Relational Key, (c) removes the superfluous ID, and (d) elevates the file to a Relational table. Try each numbered step. – PerformanceDBA May 30 '15 at 14:45
1

(a) You added some fields. Good (b) I have provided an answer, including a model. Explanation tomorrow. – PerformanceDBA May 30 '15 at 15:05
1

@PerformanceDBA I've read your ID comments, and understand what you are saying now. :) This will be a hard habit to break, as I've always used IDs as PKs. The one thing I'm still confused about is: If a natural key changes over time (e.g. last name), then you don't want to use that as the PK, since updating a PK is bad, right? Hence the use of a ID column. Your thoughts please. – Carl Prothman May 31 '15 at 13:32
1

You should choose a *stable* PK, yes. Ambler, Darwen, etc, cannot tell the difference between *stable* and *static*. There is nothing static in the data, never will be. After all, it is data, and data, by definition, changes. (a) If the last_name changes, the last_name changes. (b) Updating a PK is not "bad", it is immature: we write transactions to delete it and insert the new one, and to handle any children. Enterprise class SQL does not have `CASCADE.` (c) IDs prevent Relational Integrity (d) IDs present an **Access Path Dependence** which is prohibited in Codd's *Relational Model*. – PerformanceDBA May 31 '15 at 15:55
1

(e) Therefore, contrary to the myths propagated by the circus freaks, IDs cause more, not less, joins. (f) Comments are limited. If you would like a full discourse, ask a new question, and ping me. – PerformanceDBA May 31 '15 at 20:19

Data modelling. How to model hundreds of dependent fields, 1:n or as one table?

2 Answers2

Data Model

Record Id

Warning