I have a dataset in the form of a CSV file that is sent to me on a regular basis. I want to import this data into my MySQL database and turn it into a proper set of tables. The problem I am having is that one of the fields is used to store multiple values. For example, the field stores email addresses: it may have one email address, or two, or three, or four, etc. The field contents look something like this: `"user1@domain.com,user2@domain.com,user3@domain.com"`.

I need to be able to take the undetermined number of values from each field and add them to a separate table so that they look like this:

user1@domain.com  
user2@domain.com  
user3@domain.com

I am not sure how I can do this. Thank you for the help.
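
For reference, a minimal sketch of the staging and target tables involved; the names match those used in the answers below, and the column types are assumptions:

    create table emails (
        emails varchar(255)   -- raw field, e.g. 'user1@domain.com,user2@domain.com'
    );

    create table newtable (
        email varchar(255)    -- one address per row
    );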

user1757006

  • Which programming language do you want to use? – Peter Oct 27 '14 at 21:21
  • possible duplicate of ["Reverse GROUP_CONCAT" in MySQL?](http://stackoverflow.com/questions/17308669/reverse-group-concat-in-mysql) – jpw Oct 27 '14 at 21:22

3 Answers

Probably the simplest way is a brute force approach of inserting the first email, then the second, and so on:

insert into newtable(email)
    select substring_index(substring_index(emails, ',', 1), ',', -1)
    from emails
    where (length(replace(emails, ',', ',,')) - length(emails)) >= 1;

insert into newtable(email)
    select substring_index(substring_index(emails, ',', 2), ',', -1)
    from emails
    where (length(replace(emails, ',', ',,')) - length(emails)) >= 2;

insert into newtable(email)
    select substring_index(substring_index(emails, ',', 3), ',', -1)
    from emails
    where (length(replace(emails, ',', ',,')) - length(emails)) >= 3;

And so on.

That is, extract the nth element from the list and insert it into the table. The where clause counts the number of commas in the list: replacing each comma with two commas lengthens the string by one character per comma, so the length difference equals the comma count, which is a proxy for the number of elements.

You need to repeat this up to the maximum number of emails in the list.
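
If the maximum list length is known, the repeated statements can also be collapsed into a single insert driven by a small derived numbers table. A sketch of that approach, assuming the same emails table and column names as above (extend the union to cover the longest list you expect):

    insert into newtable(email)
        select substring_index(substring_index(e.emails, ',', n.n), ',', -1)
        from emails e
        join (select 1 as n union all select 2 union all select 3
              union all select 4 union all select 5) n
          on n.n <= length(e.emails) - length(replace(e.emails, ',', '')) + 1;

The join condition uses the same comma-counting trick (commas plus one equals the number of elements), so each source row is expanded exactly once per address.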

Gordon Linoff

Instead of importing the CSV file directly and then trying to fix the problems in it, I found the best way to attack this was to first pass the CSV through AWK.

AWK outputs three separate CSV files that follow the normal forms. I then import those files and all is well.

# Fetch the latest CSV from the FTP server.
info="`ncftpget -V -c -u myuser -p mypassword ftp://fake.com/data_map.csv`"

# table1.csv: one row per (record id, value), splitting the multi-valued
# sixth field on ";" and stripping everything up to the last ":" in each piece.
echo "$info" | \
awk -F, -v OFS="," 'NR > 1 {
    split($6, keyvalue, ";")
    for (var in keyvalue) {
        gsub(/.*:/, "", keyvalue[var])
        print $1, keyvalue[var]
    }
}' > ~/sqlrw/table1.csv

# table2.csv: the key/value pairs themselves, with ":" turned into ",".
echo "$info" | \
awk -F, -v OFS="," 'NR > 1 {
    split($6, keyvalue, ";")
    for (var in keyvalue) {
        gsub(/:/, ",", keyvalue[var])
        print keyvalue[var]
    }
}' > ~/sqlrw/table2.csv

# Deduplicate table2.
sort -u ~/sqlrw/table2.csv -o ~/sqlrw/table2.csv

# table3.csv: the remaining columns, minus the multi-valued field.
echo "$info" | \
awk -F, -v OFS="," 'NR > 1 {
    print $1, $2, $3, $4, $5, $7, $8
}' > ~/sqlrw/table3.csv
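
The resulting files can then be loaded with LOAD DATA. A minimal sketch for the first file, assuming a target table table1 with columns (record_id, email); the path and column names are placeholders:

    LOAD DATA LOCAL INFILE '/home/me/sqlrw/table1.csv'
    INTO TABLE table1
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    (record_id, email);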
user1757006

Maybe a simple PHP script would do the trick:

<?php
    // Connection details are placeholders.
    $mysqli = new mysqli("localhost", "user", "password", "my_db");
    $stmt = $mysqli->prepare("INSERT INTO my_table (email) VALUES (?)");
    // If my_table has no auto-increment id, add an id column to the
    // INSERT and bind a running counter as well.

    $handle = fopen("my_file.csv", "r");
    while (($row = fgetcsv($handle)) !== false) {
        // The multi-valued email field is assumed to be the first column;
        // split it on commas and insert one row per address.
        foreach (explode(",", $row[0]) as $email) {
            $email = trim($email);
            $stmt->bind_param("s", $email);
            $stmt->execute();
        }
    }
    fclose($handle);
?>

Hope this helps even if it's not pure SQL.

Stv