0

I am trying to use a CSV in order to fill a 34 columns SQL database by using Python, even though I can't.

import csv sqlite3
con = sqlite3.connect(":memory:")
cur = con.cursor()

cur.execute("CREATE TABLE t (No, Source, Host, Link, Date, Time, time2, Category, AuthorId, AuthorName, AuthorUrl, Auth, Followers, Following, Age, Gender, Language, Country, Province, City, Location, Sentiment, Title, Snippet, Description, Tags, Contents, View, Comments, Rating, Favourites, Duration, Bio, UniqueId);")}

with open('database.csv', 'rb') as fin:
    dr = csv.reader(fin) 
    dicts = ({'No': line[0], 'Source': line[1], 'Host': line[2], 'Link': line[3], 'Date': line[4], 'Time': line[5], 'time2': line[6], 'Category': line[7], 'AuthorId': line[8], 'AuthorName': line[9], 'AuthorUrl': line[10], 'Auth': line[11], 'Followers': line[12], 'Following': line[13], 'Age': line[14], 'Gender': line[15], 'Language': line[16], 'Country': line[17], 'Province': line[18], 'City': line[19], 'Location': line[20], 'Sentiment': line[21], 'Title': line[22], 'Snippet': line[23], 'Description': line[24], 'Tags': line[25], 'Contents': line[26], 'View': line[27], 'Comments': line[28], 'Rating': line[29], 'Favourites': line[30], 'Duration': line[31], 'Following': line[32], 'UniqueId': line[33]} for line in dr)
    to_db = ((i['No'], i['Source'], i['Host'], i['Link'], i['Date'], i['Time'], i['time2'], i['Category'], i['AuthorId'], i['AuthorName'], i['AuthorUrl'], i['Auth'], i['Followers'], i['Following'], i['Age'], i['Gender'], i['Language'], i['Country'], i['Province'], i['City'], i['Location'], i['Sentiment'], i['Title'], i['Snippet'], i['Description'], i['Tags'], i['Contents'], i['View'], i['Comments'], i['Rating'], i['Favourites'], i['Duration'], i['Bio'], i['UniqueId']) for i in dicts)

cur.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", to_db)
con.commit()

I've been following many indications, although it is my first time pythoning and I don't know how to do that.

Could you please help me with this? Thanks a lot in advanced.

Pd: In case it is not inferable, the csv file is without a header, and I am trying to fill column by column at once.

  • In what way is it going wrong? Can you provide a simple example of your input data, so that we might understand the issues you face? Your code looks pretty screwy and 34 columns seems like a lot for one table :) – Paul Rooney Jul 07 '15 at 12:31
  • BTW, Python is perfect if you need to apply some logic while importing these data, but if you just need raw importing then many DBMS support this feature out-of-the-box. http://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgres-table – MGP Jul 07 '15 at 13:36

2 Answers2

0

If the CSV elements are positionally correct, can you not do something more straight forward i.e. as an example with the following data

1,2,3
a,b,c

use the following;

import sqlite3
con = sqlite3.connect(":memory:")
cur = con.cursor()

cur.execute("CREATE TABLE t (col1,col2,col3);")

with open('database.csv', 'rb') as fp:
    for line in fp.readlines():
        cur.execute("INSERT INTO t VALUES (?, ?, ?)",line.strip().split(','))
con.commit()

for row in cur.execute("select * from t;"):
    print row
moraygrieve
  • 466
  • 4
  • 11
0

This works. I used a few short cuts to save on typing.

import csv
import sqlite3
import itertools

params = ['No', 'Source', 'Host', 'Link', 'Date', 'Time', 'time2', 'Category', 'AuthorId', 'AuthorName', 'AuthorUrl', 'Auth', 'Followers', 'Following', 'Age', 'Gender', 'Language', 'Country', 'Province', 'City', 'Location', 'Sentiment', 'Title', 'Snippet', 'Description', 'Tags', 'Contents', 'View', 'Comments', 'Rating', 'Favourites', 'Duration', 'Bio', 'UniqueId']

create_str = "CREATE TABLE t (%s);" % ', '.join('"%s"' % p for p in params)
insert_str = "INSERT INTO t VALUES (%s)" % ', '.join(itertools.repeat('?', len(params)))

with open('database.csv') as fin:
    dr = csv.DictReader(fin, fieldnames=params, skipinitialspace=True)
    lst = [tuple(d[p] for p in params) for d in dr]

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute(create_str)

cur.executemany(insert_str, lst)
con.commit()

for row in cur.execute("select * from t;"):
    print(row)

Note its bad practice to use string format operations to build up sql query strings. It can lead to sql injection attacks, if used with unknown input data. I am doing so here because the strings are only being built from known values and unknown input (that from the file) is built properly using the standard '?' placeholder with tuple passed to execute method.

Note also you have far too many parameters in one table. It should be more normalised across multiple tables, but I guess you will learn that at some point.

Paul Rooney
  • 20,879
  • 9
  • 40
  • 61
  • Than you a lot, I am trying to find why after using your code, there is this error: "You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings." – Francis Mescudi Jul 07 '15 at 13:57
  • Simple answer is don't open the file in binary mode. I adjusted my answer. It was only an issue in python 3. – Paul Rooney Jul 07 '15 at 21:38