0

Basically have many huge delimited files that I know I can import as a table, but I need to map that data to an existing rational multi-table MySQL database. There should not be any conflict with datatypes, but I'm super new to this, so please point out anything I should be watching for. Clearly I'm not going to run this in production either until I know it works.

Not 100% sure stackoverflow is the right place to ask a database question, but I couldn't find any other Stack Exchange that was a better fit.

Posted this question on SuperUser looking for a GUI to do this, but I up for coding this is it gets the job done. As such there is no target language, just the requirement that the database be MySQL.

Also, found this stackoverflow Q/A that deals with MS-SQL's SSIS (which I'm not planning on using due to cost, but the content and issues faced are of the same nature it appears.) -- Loading Multiple Tables using SSIS keeping foreign key relationships

Community
  • 1
  • 1
blunders
  • 3,619
  • 10
  • 43
  • 65

1 Answers1

1

I'd suggest using the ETL(extract translate load) tool from the Pentaho Business Intelligence package. It's got a bit of a learning curve but it'll do exactly what you're looking for. Their ETL tool is called Kettle and it's extremely powerful once you get the hang of it.

There are two versions of Pentaho, an enterprise version that has a free trial, and a free community version. The community version is more than capable but you might give the enterprise version a test ride too.

Here's some links

Pentaho Community Edition Site

Kettle Site

Pentaho Enterprise Site

Update: Multiple table outputs

One of the key steps in your transformation is going to be a combination lookup-update. This step checks a given table to see if a record from your data-stream exists and inserts a new record if it does not. Regardless of whether it's a new or old record it's going to append the key field from that record into your data-stream. As you keep going you'll use these keys as foreign keys as you import data into related tables.

rwilliams
  • 21,188
  • 6
  • 49
  • 55
  • @r-dub: Thanks Is it possible to automate Pentaho's ETL process (Kettle), meaning I create a data transformation workflow and then either have it running as a service, or run every XX mins? Also, in looking for a way to do this I have not seen a way so far to have Kettle handle mapping a table to an existing multi-table database; meaning it handles pairing the relationships between tables and the related normalization of data between said tables. I've looked over the documentation and a related book's TOC, and can't find anything about doing this with Kettle; linked to in the next comment. – blunders Nov 09 '10 at 18:23
  • **Kettle DOCs:** http://wiki.pentaho.com/display/EAI/Latest+Pentaho+Data+Integration+%28aka+Kettle%29+Documentation **Kettle BOOK:** https://www.packtpub.com/toc/pentaho-32-data-integration-beginners-guide-table-contents – blunders Nov 09 '10 at 18:25
  • 1
    You can run tasks with something like cron or task scheduler. Also when it comes to multiple tables you shouldn't have any problems. As you add rows to the 'top level' table you'll be appending their primary keys to your data set which is then used for foreign keys as you populate the child tables. – rwilliams Nov 09 '10 at 18:31
  • @r-dub: +2, plus selected as answer... Cool, thanks -- just wanted to make sure, selecting you as the answer, thanks for the quick reply! – blunders Nov 09 '10 at 19:14
  • @rwilliams aka r-dub: So, spent 12-16 hours so far trying to get Kettle to execute your suggested approach, but haven't even gotten it to work yet. Posted this question with sample data and a MySQL DDL schema. http://stackoverflow.com/questions/4175566/using-pentaho-kettle-how-do-i-load-multiple-tables-from-a-single-table-while-kee If you have any further suggestions, it'd be a huge help. – blunders Nov 14 '10 at 15:42
  • 1
    I'll take a look at it and try to give you a sample transformation – rwilliams Nov 14 '10 at 15:49
  • @rwilliams aka r-dub: +1 Great, thank you! So far I've only gotten it to read from a single table (in this case excel) and write to a single table in MySQL, with a row count added. Hacked that together using code samples from this page: http://type-exit.org/adventures-with-open-source-bi/2010/07/creating-a-basic-mondrian-olap-cube/ There's clear docs on the Combo-L/U step here: http://type-exit.org/adventures-with-open-source-bi/2010/07/a-basic-mondrian-cube-introducing-the-star-schema/ if I get this to work, I'll be able to use all over the place to help me convert flat DBs to RDBMS. – blunders Nov 14 '10 at 17:20
  • **@ALL:** To anyone reading this, a working answer was posted by r-dub here: http://stackoverflow.com/questions/4175566/using-pentaho-kettle-how-do-i-load-multiple-tables-from-a-single-table-while-kee – blunders Nov 14 '10 at 21:50