7

I am looking for the best-practice way to insert data to multiple MySQL tables where some columns are foreign key dependencies. Here is an example:

Table: contacts
--------------------------------------------------------------------
|  contact_id  |  first_name  | last_name  |  prof_id  |  zip_code  |
--------------------------------------------------------------------

The 'contacts' table has PRIMARY KEY (contact_id) which simply auto_increments, and FOREIGN KEY (prof_id) REFERENCES 'profession' table, FOREIGN KEY (zip_code) REFERENCES 'zip_code' table. Those tables would look something like this:

Table: profession
----------------------------
|  prof_id  |  profession  |
----------------------------

where 'prof_id' is an INT NOT NULL AUTO_INCREMENT PRIMARY KEY, and

Table: zip_code
--------------------------------
|  zip_code  | city  |  state  |
--------------------------------

where 'zip_code' is an INT(5) NOT NULL PRIMARY KEY.

I have a new person I want to add a record for, let's say:

first_name = 'John', last_name = 'Doe', profession = 'Network Administrator', city = 'Sometown', state = 'NY', zip_code = 12345

Here's what I'm trying to do: I want to take that information and insert it into the appropriate tables and columns. To prevent duplicate values in a column (like profession for example) I'd first want to make sure there isn't already an entry for "Network Administrator", and if there was I'd like to just get its key value, if not insert it and then get its key value. The same goes for zip_code, city, state - if it exists just use that zip_code key, otherwise insert the new data and grab the associated key. Then, finally, I'd want to enter the new contact to 'contact' table using the supplied information, including the appropriate key values associated with profession and location from the other tables.

My question is, what is the best recommended way to do this? I know I can sit here and write single statements to check if the given profession exists, if not then enter it, then get the key. Do the same for zip_code. Then finally Insert all of that into contacts, however I know there must be a better way to accomplish this in fewer (perhaps one) statement, especially considering this could cause a problem if, say, the database went offline for a moment in the midst of all of this. Is there a way to use JOINs with this INSERT to essentially have everything cascade into the correct place? Should I look to handle this with a TRANSACTION series of statements?

I am in the learning stage with SQL but I feel that the books and resources I have used thus far have sort of jumped to using nested queries and JOINS under the assumption we have all of these tables of data already populated. I'm even open to suggestions on WHAT I should be Googling to better learn this, or any resources that can help fill this gap. Ideally, I'd love to see some functioning SQL code to do this though. If necessary, assume PHP as the language to interact with the database, but command-line sql is what I was aiming for. Thanks ahead of time, hopefully I made everything clear.

Brandon K
  • 751
  • 1
  • 4
  • 12
  • 1
    You should learn ORM. I am using Doctrine 2.2. By using ORM you will be able to insert/update/delete and Doctrine will take care of all relations. You want have to do anything by hand. Doctrine is using objects. All model classes Doctrine will generate for you. – pregmatch Aug 05 '13 at 23:03
  • 1
    I know ORM will certainly be of value, and so I thank you for the information. However I wanted to understand how this works fundamentally using SQL directly first, which @seanmk answered for me below. Either way, I think your information may be useful to anyone else stumbling on this question so I am giving it a +1 as well. – Brandon K Aug 06 '13 at 01:15

1 Answers1

5

In short, you want to use transactions (more doc on this) so that your inserts are atomic. This is the only way to guarantee that all (or none) of your data will be inserted. Otherwise, you can get into the situation you describe where the database becomes unavailable after some insertions and others are unable to complete. A transaction tells the database that what you are doing is all-or-nothing and so it should roll back if something goes wrong.

When you are using synthetic primary keys, as you are, PHP and other languages provide mechanisms for getting the last inserted id. If you want to do it entirely in MySQL you can use the LAST_INSERT_ID() function. You will end up with code like this:

START TRANSACTION;
INSERT INTO foo (auto,text)
    VALUES(NULL,'text');         # generate ID by inserting NULL
INSERT INTO foo2 (id,text)
    VALUES(LAST_INSERT_ID(),'text');  # use ID in second table
COMMIT;
seanmk
  • 1,934
  • 15
  • 28
  • Thank you for a fast and concise response. I have also just read up on using variables in MySQL which I think will help me here since I will need to keep track of multiple previous insertion id values rather than just the immediate previous one. Further reading has lead me to believe I will need to do something like "INSERT IGNORE INTO ..." or "INSERT ... ON DUPLICATE KEY UPDATE" to check for existing values in a given column and avoid duplicating information - does this sound about right to you? – Brandon K Aug 06 '13 at 01:07
  • 1
    Yes, but be careful because INSERT ... ON DUPLICATE KEY UPDATE will lock the entire table you are trying to insert into, so its use can lead to a lot of database contention as other requests must wait for them to complete. On the other hand there are issues with INSERT IGNORE as discussed here: http://stackoverflow.com/questions/548541/insert-ignore-vs-insert-on-duplicate-key-update – seanmk Aug 06 '13 at 01:18