1

I have a folder that contains hundreds of csv files. Each file has date as it's name as data in my directory is being created daily, for example 2020-01-15.csv, 2020-01-16.csv, 2020-01-17.csv etc. I'm looking for a best way to import my files into mysql database daily and create table for each file (no need to create table if table with filename already exists).

So far in order to import my files into mysql database I've used mysqlimport, but I used this to import one file at the time, this time it looks like my knowledge about this function will not be enough, this is what I've tried so far in bash:

mysqlimport -h localhost -umyusername -pmypassword database_name /path/to/my/data/*.csv

Error received:

mysqlimport: Error: 1146, Table 'database_name.2020-01-15' doesn't exist, when using table: 2020-01-15

Could someone help me with this? Would there be an easier way doing this in python? Thanks in advance.

File single file structure:

['date,id,name,gsmCount,userCount,regionCount\n',
 '2020-01-25,g45ddf-54fdfd4,GammaY,22142,3212,132\n', 
 '2020-01-25,g412ddf-54re321d4,BetaT,351871,734,67\n',
 '2020-01-25,fsdsf579hhh-fgd4,LambdaD,367,41,7\n']

So this is my current script:

#!/bin/bash

# show commands being executed, per debug
set -x

# define database connectivity
_db="mydatabasename"
_db_user="myusername"
_db_password="mypassword"


# define directory containing CSV files
_csv_directory="/path/to/my/data"

# go into directory
cd $_csv_directory || exit

# edit file name
rename "s/ //g" *.csv
rename "s/^/tp/g" *.csv

# get a list of CSV files in directory
_csv_files=`ls -1 *.csv`

# loop through csv files
for _csv_file in ${_csv_files[@]}
do

 # remove file extension
 _csv_file_extensionless=`echo "$_csv_file" | sed 's/\(.*\)\..*/\1/'`

 # define table name
 _table_name="${_csv_file_extensionless}"

 # get header columns from CSV file
 _header_columns=`head -1 $_csv_directory/$_csv_file | tr ',' '\n' |   sed 's/"//' | sed 's/ /_/g'`
 _header_columns_string=`head -1 $_csv_directory/$_csv_file | sed 's/ /_/g' | sed 's/"//g' | sed 's/(//g' | sed 's/)//g'`

 # ensure table exists
 mysql -u $_db_user -p$_db_password $_db << eof
     CREATE TABLE IF NOT EXISTS \`$_table_name\` ENGINE=MyISAM DEFAULT CHARSET=utf8
eof

 # loop through header columns
 for _header in "${_header_columns[@]}"
 do

   # add column
   mysql -u $_db_user -p$_db_password $_db --execute="alter table \`$_table_name\` add column IF NOT EXISTS \`$_header\` text"

 done

 # import csv into mysql
 mysqlimport --fields-enclosed-by='"' --fields-terminated-by=',' -- lines-terminated-by="\n" --columns=$_header_columns_string -u $_db_user - p$_db_password $_db $_csv_directory/$_csv_file

done
exit

and here is the error I'm receiving when running the above:

myserver:~ user_name$ bash -c -l "/path/to/my/script/uploadmysql.sh"
+ _db=mydatabasename
+ _db_user=myusername
+ _db_password=mypassword
+ _csv_directory=/path/to/my/data
+ cd /path/to/my/data
+ rename 's/ //g' 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
/path/to/my/script/uploadmysql.sh: line 19: rename: command not found
+ rename 's/^/tp/g' 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
/path/to/my/script/uploadmysql.sh: line 20: rename: command not found
++ ls -1 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
+ _csv_files='2020-01-25.csv
2020-01-26.csv
2020-01-27.csv'
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-25.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-25
+ _table_name=2020-01-25
++ head -1 /path/to/my/data/2020-01-25.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-25.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table  `2020-01-25` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax;  check the manual that corresponds to your MySQL server version for the  right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' -- columns=date,id,Name,gsmCount,userCount,regionCount -u myusername - pmypassword mydatabase /path/to/my/data/2020-01-25.csv
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-25' doesn't exist, when using table: 2020-01-25
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-26.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-26
+ _table_name=2020-01-26
++ head -1 /path/to/my/data/2020-01-26.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-26.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table   `2020-01-26` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--  lines-terminated-by=\n' --  columns=date,id,Name,gsmCount,userCount,regionCount -u myusername -  pmypassword mydatabase /path/to/my/data/2020-01-26.csv
mysqlimport: [Warning] Using a password on the command line interface can  be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-26' doesn't exist,  when using table: 2020-01-26
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-27.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-27
+ _table_name=2020-01-27
++ head -1 /path/to/my/data/2020-01-27.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-27.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table     `2020-01-27` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be   insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' --columns=date,id,Name,gsmCount,userCount,regionCount -u myusername - pmypassword mydatabase /path/to/my/data/2020-01-27.csv
 mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-27' doesn't exist,      when using table: 2020-01-27
+ exit

still trying to get rid of errors and have multiple CSV files imported as tables into mysql. Could someone give me a hint how to fix the issues? thanks in advance

Baobab1988
  • 685
  • 13
  • 33
  • Can you show us structure of your csv file? – Malkhazi Dartsmelidze Feb 02 '20 at 19:07
  • Hi @Malkhazi I just added file structure. All the files have exactly same column names and count. Would you be able to help me with import? thank you in advance. – Baobab1988 Feb 02 '20 at 19:22
  • You could this loop https://stackoverflow.com/a/20796617/5193536 to get the file name, create a table, but this is not possible as far as i can see. But of course you could use python to do the job for you – nbk Feb 02 '20 at 19:22
  • @nbk any idea how to start with python in order to tackle this scenario? thanks! – Baobab1988 Feb 02 '20 at 19:27
  • sure start with getting filenames https://stackoverflow.com/questions/3207219/how-do-i-list-all-files-of-a-directory and then take it from there. – nbk Feb 02 '20 at 19:30
  • You have three steps # generate a table name from the file name # execute a create table command for the new table # execute your mysqlimport command to import that data for that file into the new table I would probably do this with a shell script rather than python, but you could do it in any language. – Rob Sweet Feb 02 '20 at 19:30
  • @Rob and nbk I guess I'm close to solve the problem. Added my code and errors above, would you know how to get this to work? I assume there is problem as I'm trying to create table with no columns first and only then running query which suppose to add columns separately? – Baobab1988 Feb 02 '20 at 22:32

1 Answers1

0

This example for execsql.py (https://pypi.org/project/execsql/) shows how to get all the file names in a directory, loop over them, and import each one into its own table: http://execsql.osdn.io/examples.html#example-13-import-all-the-csv-files-in-a-directory. This example is written for Postgres rather than MySQL, and puts the tables in a staging directory (named "staging"), but it can be easily modified to work with MySQL.

Disclaimer: I wrote execsql.

rd_nielsen
  • 2,407
  • 2
  • 11
  • 18