
I have a CSV file that looks something like this:

Date,Person,Time Out,Time Back,Restaurant,Calories,Delicious?
6/20/2016,August,11:58,12:45,Black Bear,850,Y
6/20/2016,Marcellus,12:00,12:30,Brought Lunch,,Y
6/20/2016,Jessica,11:30,12:30,Wendy's,815,N
6/21/2016,August,12:05,1:01,Brought Lunch,,Y

So far I have managed to print each row as a list of strings (for ex. - ['Date', 'Person', 'Time Out', etc.] or ['6/20/2016', 'August', '11:58', etc.]).

Now I need to do 2 more things:

  1. Add an ID header and sequential numeric string to each row (for ex. - ['ID', 'Date', 'Person', etc.] and ['1', '6/20/2016', 'August', etc.])
  2. Format each row into an INSERT statement, rather than just having the program print every row one after another (for ex. - INSERT INTO Table ['ID', 'Date', 'Person', etc.] VALUES ['1', '6/20/2016', 'August', etc.])

Here is the code that has gotten me as far as I am now:

import csv

openFile = open('test.csv', 'r')
csvFile = csv.reader(openFile)
for row in csvFile:
    print(row)  # each row comes back as a list of strings
openFile.close()
ThoseKind
    Is the ID column in your SQL table a primary key? If so you could rely on SQL's **auto_increment** property and just ignore the ID column in the insert query. – Mumpo Jun 30 '16 at 22:57
  • Why don't you use MySQL's built-in `LOAD DATA INFILE` to load directly from the CSV file to the database, instead of parsing it in Python? – Barmar Jul 01 '16 at 00:19
  • @Mumpo, Yes it is. I am not very familiar with MySQL so I did not know this was an option, but it seems far more viable than what I had listed above. Thanks for the tip. – ThoseKind Jul 01 '16 at 15:53
  • @Barmar, Mainly because there are some other things that I want to do with the file before inserting it into MySQL, and I figured it wouldn't be too hard to mess around with this first. However, if it doesn't work that is definitely an option. Thanks for putting that out there. – ThoseKind Jul 01 '16 at 15:55
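
For reference, here is roughly what Barmar's LOAD DATA INFILE suggestion would look like when driven from Python. This is a minimal sketch, not from the original posts: the mysql-connector-python package, the connection details, and the lunch_log table name are all assumptions, and local_infile must be enabled on the server.

import mysql.connector  # assumption: requires the mysql-connector-python package

conn = mysql.connector.connect(user='...', password='...', database='test',
                               allow_local_infile=True)
cur = conn.cursor()
# Let MySQL parse the CSV itself; IGNORE 1 LINES skips the header row
cur.execute("""
    LOAD DATA LOCAL INFILE 'test.csv'
    INTO TABLE lunch_log
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    IGNORE 1 LINES
""")
conn.commit()
conn.close()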

3 Answers


Try this (I ignored the ID part since you can use MySQL's auto_increment):

import csv

openFile = open('test.csv', 'r')
csvFile = csv.reader(openFile)
header = next(csvFile)
# Backtick-quote each column name so headers with spaces (e.g. Time Out) are valid
headers = map(lambda x: '`' + x + '`', header)
insert = 'INSERT INTO Table (' + ", ".join(headers) + ") VALUES "
for row in csvFile:
    # Double-quote each value; empty fields (like the missing Calories) become ""
    values = map(lambda x: '"' + x + '"', row)
    print(insert + "(" + ", ".join(values) + ");")
openFile.close()
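
Run against the sample CSV above, the first generated statement comes out as:

INSERT INTO Table (`Date`, `Person`, `Time Out`, `Time Back`, `Restaurant`, `Calories`, `Delicious?`) VALUES ("6/20/2016", "August", "11:58", "12:45", "Black Bear", "850", "Y");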
Mumpo
  • That worked. Thanks for the help! As a side note, do you know if there would be any way to get rid of unwanted spaces between values in the VALUES list? – ThoseKind Jul 01 '16 at 15:52
  • Those spaces were intentional, just replace the line before the last with this one: `print (insert +"("+ ",".join(values) +");" )` – Mumpo Jul 01 '16 at 16:43
  • Oh no I understand that part, I was more so referring to the actual strings of data themselves (for ex. - `'UNKNOWN ',` to instead be `'UNKNOWN',`) – ThoseKind Jul 01 '16 at 17:04
  • You can use `.strip()` to achieve that. `values = map((lambda x: '"'+x.strip()+'"'), row)` – Mumpo Jul 01 '16 at 17:16
  • Again worked perfectly. And just to make sure I get my money's worth, how would I go about getting this to work on a group of files in a directory rather than just one named file? I've tried using `glob.glob(filePath)` and then `csv.reader(filePathObj)`, but instead of getting back the contents of the files I get back the names instead (for ex. - `INSERT INTO Table ('C:/Users/etc') VALUES ('C:/Users/etc')`) – ThoseKind Jul 01 '16 at 20:14
  • But isn't this vulnerable to SQL injection attacks? – Kishan B Dec 27 '21 at 14:14
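
The last two comments point toward the same fix: iterate over the file names glob returns and open each one (passing the path string straight to csv.reader is what produced the file-name output above), and let the database driver fill in the values so nothing can be injected. A minimal sketch, assuming the mysql-connector-python package; the connection details and the lunch_log table name are illustrative, not from the original posts:

import csv
import glob

import mysql.connector  # assumption: requires the mysql-connector-python package

conn = mysql.connector.connect(user='...', password='...', database='test')
cur = conn.cursor()

for path in glob.glob('C:/Users/etc/*.csv'):  # glob yields names, not contents
    with open(path, 'r', newline='') as f:    # so each file still has to be opened
        reader = csv.reader(f)
        header = next(reader)
        columns = ', '.join('`' + h + '`' for h in header)
        placeholders = ', '.join(['%s'] * len(header))
        sql = 'INSERT INTO lunch_log (' + columns + ') VALUES (' + placeholders + ')'
        for row in reader:
            # Parameterized execute: the driver quotes and escapes every value,
            # which addresses the SQL injection concern raised above
            cur.execute(sql, [v.strip() for v in row])

conn.commit()
conn.close()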

You can use this function if you want to maintain type conversion; I have used it to put data into Google BigQuery with a SQL statement string.

PS: You can add other types to the function.

import csv

def convert(value):
    # Try the numeric types in order; fall back to string if none fit
    for cast in [int, float]:
        try:
            return cast(value)
        except ValueError:
            continue
    # All numeric conversions failed, so treat the value as a string
    return value


def construct_string_sql(file_path, table_name, schema_name):
    string_SQL = ''
    try:
        with open(file_path, 'r') as file:
            reader = csv.reader(file)
            headers = ','.join(next(reader))
            for row in reader:
                # str() of the converted list gives comma-separated values with
                # strings quoted; [1:-1] strips the surrounding brackets
                values = str([convert(x) for x in row])[1:-1]
                string_SQL += f'INSERT INTO {schema_name}.{table_name}({headers}) VALUES ({values});'
    except (OSError, StopIteration):
        # Unreadable or empty file: return an empty string
        return ''

    return string_SQL
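
A minimal usage sketch (the file, table, and schema names here are illustrative):

sql = construct_string_sql('test.csv', 'lunch_log', 'my_dataset')
print(sql)  # one INSERT per CSV row, concatenated into a single string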

You can use this open-source tool to generate batch INSERT statements: https://simranjitk.github.io/sql-converter/.

Simran