0

I have a set of data where half of the records are original sources and the other half have the same attributes (fields) but some of those fields are calculated as a function of the set of original records.

As a simplified example, there are 3 original records, all of which have a given length:

id length 
1  50     
2  55     
3  70     

I want to add two more records, each of whose lengths are a function of the lengths of one of these original records. To do this, I added another column (related_id) which is a foreign key that references the table's original id. In this case, the length values for these additional records is 10 more than that of the record defined by the related_id field.

id length related_id
1  50     (none)    
2  55     (none)
3  70     (none)
4  60     1
5  80     3

For some reason, this mixing of flat records with records that have dependencies on them seems like it might cause problems down the road, so I want to verify that this is a legitimate approach for this type of scenario.

The alternative, in my mind, would be to create a second table that has just the foreign key linking back to the first, i.e., which will allow me to get their lengths in each case as a function of the original table:

id related_id
1  1
2  3

However, I need all these records to appear in the same table, where the id can be used as a foreign key in another table that needs access to all these records.

Of course, in this table, there are many different fields with various functional dependencies on the "original" set of records, so it becomes a bit more complex with that in the mix. What is the recommended practice for this type of situation?

(I am working with Pandas to pull data from tables and to do verification of each field.)

teepee
  • 2,620
  • 2
  • 22
  • 47
  • Both simple designs are fine. To explain why is to rewrite a textbook. The topic is "normalization to higher (than 1st) normal forms". Time to read a published academic textbook on information modelling, the relational model & DB design. (Manuals for languages & tools to record & use designs are not textbooks on doing information modeling & database design.) Dozens of published academic information modeling & database design textbooks are online free in pdf. stanford.edu has a free online course. (But asking for resources outside SO is off-topic.) PS "flat record" & "mixing" are not clear. – philipxy May 28 '19 at 23:31
  • Possible duplicate of ['What is Normalisation (or Normalization)?](https://stackoverflow.com/q/246701/3404097) – philipxy May 28 '19 at 23:38
  • Without going into textbooks, yes what you are doing is fine, assuming that each row only has one related id. If there are multiple relationships then a separate table is the way to go. And yes - you do have to be careful because if you are adding a foreign key then you always have to delete any other rows that rely on it first before you can update the parent row. So a simple delete of all rows will fail. Nothing unusual about that though. – TomC May 31 '19 at 04:53

0 Answers0