I have a set of data where half of the records are original sources and the other half have the same attributes (fields) but some of those fields are calculated as a function of the set of original records.
As a simplified example, there are 3 original records, all of which have a given length:
id length
1 50
2 55
3 70
I want to add two more records, each of whose lengths are a function of the lengths of one of these original records. To do this, I added another column (related_id
) which is a foreign key that references the table's original id. In this case, the length values for these additional records is 10 more than that of the record defined by the related_id
field.
id length related_id
1 50 (none)
2 55 (none)
3 70 (none)
4 60 1
5 80 3
For some reason, this mixing of flat records with records that have dependencies on them seems like it might cause problems down the road, so I want to verify that this is a legitimate approach for this type of scenario.
The alternative, in my mind, would be to create a second table that has just the foreign key linking back to the first, i.e., which will allow me to get their lengths in each case as a function of the original table:
id related_id
1 1
2 3
However, I need all these records to appear in the same table, where the id
can be used as a foreign key in another table that needs access to all these records.
Of course, in this table, there are many different fields with various functional dependencies on the "original" set of records, so it becomes a bit more complex with that in the mix. What is the recommended practice for this type of situation?
(I am working with Pandas to pull data from tables and to do verification of each field.)