1

I have a Magento shop (using MySql db) and just noticed that some developer introduced a custom db for capturing some structured data.

Now I noticed that the tables are not linked via foreign keys with each other, but just added a column e.g. priceListID = 01124 which is the same Id as on price list table. So linking the data together must happen within the code by firing different select statements I assume.

Now I am wondering if this needs to be fixed soon or if it actually is ok not to use foreign keys on db level to link data together? What are the down sides of doing this and are there maybe some benefits (like flexibility?)

Hope you can help me with this! Thanks a lot!

GMB
  • 216,147
  • 25
  • 84
  • 135
user3122136
  • 161
  • 4
  • 15
  • Foreign keys help with data integrity, but they're not absolutely necessary. If you're concerned about integrity, consider adding them. – tadman Apr 05 '19 at 20:23

2 Answers2

0

From your description, I understand that tables are indeed functionaly related, as they share a common piece of information (priceListID in the new table relates to id in the original table). On the one hand, this set-up would still allow writing queries that join the tables together.

The downside of not creating a foreign key to represent that relationship, however, is that, from database perspective, the consistency of the relationship cannot be guaranteed. It is, for example, possible that records are created in the new table where priceListID do not exist in the original table. It would also be possible to delete records in the old table while related records exists in the new one, hence turning the children to orphans.

As a conclusion: by not using foreign keys, the developers rely solely on the application to maintain data integrity. There is no obvious benefit not using the built-in features that the RDBMS offers to protect data consistency, and chances are developers just forgot that critical part of the table definition. I would suggest having a talk with them and intimate them to create the missing foreign key (unless they can give a clear explanation why they did not).

This should be as simple as:

ALTER TABLE newtable
    ADD CONSTRAINT fk_new_to_original_fk 
    FOREIGN KEY (priceListID ) 
    REFERENCES originaltable(id);

Please note that this requires all values in the referrencing column to be available in the parent table.

GMB
  • 216,147
  • 25
  • 84
  • 135
  • Thanks for briefe explaination! Is it much work to add these foreign keys? Is it actually possible at all now that there are plenty of data already? I am asuming that the data integrity will not be 100% (and does not need to be since it is not business critical) but in future I would strive for better quality of data anyways. – user3122136 Apr 05 '19 at 21:17
  • @user3122136: welcome! Adding the foreign key can be done with a simple SQL statement, but it requires data to be consistent at the time when it is executed. I updated my answer. – GMB Apr 05 '19 at 21:23
  • Thanks! Any way of also "auto adding" all non existing records in the other table? It is not necessary that any field is filled when auto adding, just needs to be created so it can be created. – user3122136 Apr 05 '19 at 21:33
  • @user3122136: see [this SO post](https://stackoverflow.com/questions/21659691/error-1452-cannot-add-or-update-a-child-row-a-foreign-key-constraint-fails) – GMB Apr 05 '19 at 21:36
0

There're few advantages of keeping such constraints inside a database:

  1. Performance. Most of constraints, such as foreign keys, are better implemented if stored inside a database, close to the data. You want to check data integrity with additional select? You have to make an extra request to the database. It requires some time.
  2. What if you have several applications that work with your database? You have to write code for checking data integrity in all of them, which implies additional expenses.
  3. Synchronization. While you're checking data integrity with additional select, some other user may delete this data at the same time. And you will not know about it. Of course, these checks can be properly implemented, but this is still an extra work you have to do.

To me, this is all a smell of bad, not scalable design which can bring many problems. Data integrity is what databases are built for. And these types of verifications should stay inside a database.

Pavel Smirnov
  • 4,611
  • 3
  • 18
  • 28
  • Thanks for the reply! Regarding joining data with tables which are not connected, is there any performance measure how much slower this might be or is it the same speed if I join the data via SQL statement? – user3122136 Apr 05 '19 at 21:20
  • @user3122136, if you provide proper indexes on joined columns - there won't be much difference in terms of performance. But you still have no restrictions on data to be stored in those tables. You will face performance issues when you want to *validate* data before applying some action on it. These are quite different things. – Pavel Smirnov Apr 05 '19 at 21:27