
UPDATE: The MySQL documentation states: "Setting foreign_key_checks to 1 does not trigger a scan of the existing table data. Therefore, rows added to the table while foreign_key_checks = 0 will not be verified for consistency." SOURCE: 5.1.4. Server System Variables -- foreign_key_checks. As a result, it appears that turning OFF foreign_key_checks is NOT an option.

I need to load data from a single file with 100,000+ records into multiple tables in MySQL while maintaining the relationships defined in the file/tables; that is, the relationships in the file already match. The solution should work on the latest version of MySQL, and may use either the InnoDB or MyISAM engine.

I am completely new to all this and have very little experience auto-generating IDs and establishing foreign key relationships. Any pointers would be appreciated.

See the UPDATE note above. I might add that the updates do not have to be made on a live database, so it is OKAY to disable foreign key constraints, execute the inserts, and then enable the constraints again. It was my understanding that the operation would fail if something were wrong with the database's referential integrity.

All approaches should include some form of validation and a rollback/cleanup strategy in case an insert fails or fails to maintain referential integrity.

Again, I am completely new to this and am doing my best to provide as much information as possible. If you have any questions or requests for clarification, just let me know.

Thanks!


SAMPLE DATA: To elaborate with an example, let's assume I am trying to load a file containing employee names, the offices they have occupied in the past, and their job title history, with the fields separated by tabs.

File:

EmployeeName<tab>OfficeHistory<tab>JobLevelHistory
John Smith<tab>501<tab>Engineer
John Smith<tab>601<tab>Senior Engineer
John Smith<tab>701<tab>Manager
Alex Button<tab>601<tab>Senior Assistant
Alex Button<tab>454<tab>Manager

NOTE: The single table database is completely normalized (as much as a single table may be) -- and for example, in the case of "John Smith" there is only one John Smith; meaning there are no duplicates that would lead to conflicts in referential integrity.

The MyOffice database schema has the following tables:

Employee (nId, name)
Office (nId, number)
JobTitle (nId, titleName)
Employee2Office (nEmpID, nOfficeId)
Employee2JobTitle (nEmpId, nJobTitleID)

How can I use MySQL to load the file into the schema above, auto-generating IDs for Employee, Office and JobTitle while maintaining the relationships between employees and offices, and between employees and job titles?

So in this case, the tables should look like:

Employee
1 John Smith
2 Alex Button

Office
1 501
2 601
3 701
4 454

JobTitle
1 Engineer
2 Senior Engineer
3 Manager
4 Senior Assistant

Employee2Office
1 1
1 2
1 3
2 2
2 4

Employee2JobTitle
1 1
1 2
1 3
2 4
2 3
blunders

1 Answer


I would upload all the files into a staging database with the following tables:

Temp_Employee (nId, name) Temp_Office (nId, number) ...

There would be no constraints or FKs on these tables. Once the records are uploaded, you can add IDs for the records, check the integrity, and then move them to the live database (disabling the FKs, moving the data, and enabling the FKs again).
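To make the staging idea concrete, here is a minimal sketch of it, not a definitive implementation. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The `Staging` table, its `row_no` column, the `LOAD_STEPS` list and the `load` function are names made up for this illustration; the other table and column names come from the question. IDs are generated by the database in order of first appearance, the link tables are filled by joining the staging rows back to the generated IDs, and everything runs in one transaction so a failed insert or a failed validation rolls the whole load back.

```python
import sqlite3

# Sample rows from the question: tab-separated, with a header line.
SAMPLE = (
    "EmployeeName\tOfficeHistory\tJobLevelHistory\n"
    "John Smith\t501\tEngineer\n"
    "John Smith\t601\tSenior Engineer\n"
    "John Smith\t701\tManager\n"
    "Alex Button\t601\tSenior Assistant\n"
    "Alex Button\t454\tManager\n"
)

# Staging has no constraints beyond NOT NULL; row_no preserves file order.
SCHEMA = """
CREATE TABLE Staging (row_no INTEGER PRIMARY KEY AUTOINCREMENT,
                      name TEXT NOT NULL, office TEXT NOT NULL, title TEXT NOT NULL);
CREATE TABLE Employee (nId INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL UNIQUE);
CREATE TABLE Office   (nId INTEGER PRIMARY KEY AUTOINCREMENT, number TEXT NOT NULL UNIQUE);
CREATE TABLE JobTitle (nId INTEGER PRIMARY KEY AUTOINCREMENT, titleName TEXT NOT NULL UNIQUE);
CREATE TABLE Employee2Office (
    nEmpID INTEGER NOT NULL REFERENCES Employee(nId),
    nOfficeId INTEGER NOT NULL REFERENCES Office(nId),
    PRIMARY KEY (nEmpID, nOfficeId));
CREATE TABLE Employee2JobTitle (
    nEmpId INTEGER NOT NULL REFERENCES Employee(nId),
    nJobTitleID INTEGER NOT NULL REFERENCES JobTitle(nId),
    PRIMARY KEY (nEmpId, nJobTitleID));
"""

# Parents first (IDs assigned in order of first appearance), then link tables.
LOAD_STEPS = [
    "INSERT INTO Employee (name) "
    "SELECT name FROM Staging GROUP BY name ORDER BY MIN(row_no)",
    "INSERT INTO Office (number) "
    "SELECT office FROM Staging GROUP BY office ORDER BY MIN(row_no)",
    "INSERT INTO JobTitle (titleName) "
    "SELECT title FROM Staging GROUP BY title ORDER BY MIN(row_no)",
    "INSERT INTO Employee2Office (nEmpID, nOfficeId) "
    "SELECT DISTINCT e.nId, o.nId FROM Staging s "
    "JOIN Employee e ON e.name = s.name JOIN Office o ON o.number = s.office",
    "INSERT INTO Employee2JobTitle (nEmpId, nJobTitleID) "
    "SELECT DISTINCT e.nId, j.nId FROM Staging s "
    "JOIN Employee e ON e.name = s.name JOIN JobTitle j ON j.titleName = s.title",
]

# Validation: each link table must hold one row per distinct staging pair.
CHECKS = [
    ("Employee2Office", "SELECT DISTINCT name, office FROM Staging"),
    ("Employee2JobTitle", "SELECT DISTINCT name, title FROM Staging"),
]


def load(text):
    """Load tab-separated text into the schema; roll back on any failure."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    rows = [line.split("\t") for line in text.splitlines()[1:] if line]
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO Staging (name, office, title) VALUES (?, ?, ?)", rows)
        for statement in LOAD_STEPS:
            conn.execute(statement)
        for table, pairs in CHECKS:
            want = conn.execute(f"SELECT COUNT(*) FROM ({pairs})").fetchone()[0]
            got = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
            if want != got:
                raise ValueError(f"{table}: expected {want} links, found {got}")
    return conn


if __name__ == "__main__":
    db = load(SAMPLE)
    for table in ("Employee", "Office", "JobTitle"):
        print(table, db.execute(f"SELECT * FROM {table} ORDER BY nId").fetchall())
```

In MySQL the same shape would presumably use `LOAD DATA INFILE` to fill the staging table and `AUTO_INCREMENT` columns for the IDs. The rollback part relies on a transactional engine such as InnoDB; MyISAM neither supports transactions nor enforces foreign keys, so with MyISAM the cleanup would have to be done by hand.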

HamoriZ
  • @Zoltan Hamori: Possibly I'm misunderstanding you, but it sounds like you're suggesting I add the IDs by hand -- is that correct? My understanding was that it would make more sense to have MySQL generate them every time a new instance is discovered. Plus, I really need the answer to include the SQL statements required to complete the task from start to finish. – blunders Nov 10 '10 at 15:33
  • Once your temp tables are populated with the records, you just update the ID column with a sequence, so it can be automated. For example: UPDATE temp_employee SET ID=seq_emp.NEXTVAL – HamoriZ Nov 10 '10 at 15:35
  • @Zoltan Hamori: In looking into your suggestion, I ran across an issue: I was wrong. If foreign_key_checks is turned OFF, the DB will not go back and check existing data for referential integrity when it's turned back on, so turning it off is not an option... any suggestions? I've added an update to the body of my question linking to the MySQL docs that state this. Thanks! – blunders Nov 10 '10 at 16:13
  • Ok, I see. Is it possible to append rows to the table? – HamoriZ Nov 10 '10 at 16:24
  • @Zoltan Hamori: By append do you mean in batch, or on a row-by-row basis? – blunders Nov 10 '10 at 16:42
  • I mean that the new records would physically go at the very end of the table. The insert process is much faster in that case. I'm not sure that MySQL has this feature. – HamoriZ Nov 10 '10 at 16:46
  • @Zoltan Hamori: Here's the way I ended up doing it... Cheers! http://stackoverflow.com/questions/4175566/using-pentaho-kettle-how-do-i-load-multiple-tables-from-a-single-table-while-kee – blunders Nov 14 '10 at 21:48
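On the point raised in the comments, that re-enabling foreign key checks does not re-validate rows inserted while they were off: one way to cover that gap is to run an explicit orphan query after the load. The sketch below is only an illustration under stated assumptions. It uses Python's built-in sqlite3 module as a stand-in for MySQL (SQLite's `PRAGMA foreign_keys = ON` likewise does not rescan existing rows), and `find_orphans` and `demo` are hypothetical helper names. The LEFT JOIN query itself is plain SQL and should carry over to MySQL largely unchanged.

```python
import sqlite3


def find_orphans(conn):
    """Return Employee2Office rows whose employee or office does not exist."""
    return conn.execute(
        """
        SELECT eo.nEmpID, eo.nOfficeId
        FROM Employee2Office AS eo
        LEFT JOIN Employee AS e ON e.nId = eo.nEmpID
        LEFT JOIN Office   AS o ON o.nId = eo.nOfficeId
        WHERE e.nId IS NULL OR o.nId IS NULL
        ORDER BY eo.nEmpID, eo.nOfficeId
        """
    ).fetchall()


def demo():
    """Insert one valid and one orphaned link while FK checks are off."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Employee (nId INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE Office   (nId INTEGER PRIMARY KEY, number TEXT NOT NULL);
        CREATE TABLE Employee2Office (
            nEmpID INTEGER NOT NULL REFERENCES Employee(nId),
            nOfficeId INTEGER NOT NULL REFERENCES Office(nId));
        """
    )
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.execute("INSERT INTO Employee (nId, name) VALUES (1, 'John Smith')")
    conn.execute("INSERT INTO Office (nId, number) VALUES (1, '501')")
    conn.execute("INSERT INTO Employee2Office VALUES (1, 1)")  # valid link
    conn.execute("INSERT INTO Employee2Office VALUES (2, 1)")  # employee 2 is missing
    conn.commit()
    # Turning checks back on does not rescan the rows already in the table.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


if __name__ == "__main__":
    print(find_orphans(demo()))
```

If the query returns any rows, the load can be treated as failed and cleaned up before the data is moved anywhere else.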