2

I'm just starting with Python, and am working with the following code to import a CSV file into an sqlite3 table, which I freely admit I've copied most of from the internet:

with open(getPathTo('my.csv'), 'r') as csvfile:
  reader = csv.DictReader(csvfile)
  records = [(row['SEGMENT'], row['Comp 1'], row['Comp 2']) for row in reader]
c.executemany("INSERT INTO comparison (`SEGMENT`, `Comp 1`, `Comp 2`) VALUES (?,?,?);", records)
conn.commit()

It works fine, but I'm repeating this for a lot of files and tables, and I'd like to turn it into a function. What I'm aiming for is something like:

def importCSVToTable(file, table, columns)

But, given a list of columns, how can I use it in this line:

records = [(row['SEGMENT'], row['Comp 1'], row['Comp 2']) for row in reader]

I'm just a bit thrown by the syntax, I think.

Ani Menon
  • 27,209
  • 16
  • 105
  • 126
Wintermute
  • 2,973
  • 4
  • 32
  • 52

4 Answers4

2

Here's some example code showing something that might be helpful. We use a nested comprehension with a filtering if statement to make sure we're not trying to access nonexistent dict items.

In [3]: def importCSVtoTable(file, table, columns):
   ...:     # 'mock' data to simulate a reader
   ...:     reader = [{'SEGMENT': 2, 'Comp 1': 'dogs'}, {'Comp 2': 'cats', 'OTHERTHING': 4}
   ...:     print [[row[label] for label in columns if label in row] for row in reader]
   ...:

In [4]: importCSVtoTable(None, None, ['SEGMENT', 'Comp 1'])
[[2, 'dogs'], []]

In [5]: importCSVtoTable(None, None, ['SEGMENT', 'Comp 1', 'Comp 2'])
[[2, 'dogs'], ['cats']]
a p
  • 3,098
  • 2
  • 24
  • 46
1

So I think what you're asking is, given a list of keys (columns), how do I extract them from a dictionary? Let's just use an in-memory CSV file to test this:

>>> example_data = """col1,col2,col3\na1,a2,a3\nb1,b2,b3\nc1,c2c3"""
>>> print example_data
col1,col2,col
a1,a2,a3
b1,b2,b3
c1,c2c3

Then, if we have a csvreader based on that:

>>> import csv, StringIO
>>> reader = csv.DictReader(StringIO.StringIO(example_data))
>>> print reader.fieldnames
['col1', 'col2', 'col3'] 

So, if we want to iterate based on that list of fields:

>>> for row in reader:
...     print 'insert into mytable (%s) values (%s)' % (','.join(reader.fieldnames), ','.join(['?']*len(reader.fieldnames)))
insert into mytable (col1,col2,col3) values (?,?,?)
insert into mytable (col1,col2,col3) values (?,?,?)
insert into mytable (col1,col2,col3) values (?,?,?)

From there, obviously, you'll want to adapt it for your function. But does that answer your question about the mechanics of csvreader and manipulating Python lists?

(Note, this is for Python 2.)

Ken Kinder
  • 12,654
  • 6
  • 50
  • 70
  • It's not exactly what I was looking for - I wanted to see if I could adapt the one line in my existing code to work with a list - but it's certainly useful nonetheless, thank you. – Wintermute Apr 23 '16 at 13:20
0

perhaps something like this? untested of course.

def importCSVToTable(conn, filename, table, columns)
    with open(getPathTo(filename), 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        records = []
        for row in reader:
            for col in columns:
                records.append(row[col])
    conn.executemany("INSERT INTO comparison (" + ','.join(columns) + ") VALUES ("+ ','.join(["?"]*columns.length) +");", records)
    conn.commit()
karina
  • 805
  • 5
  • 10
  • Yeah, that would do it but I wanted a way to do it all on one line, which the answer above does. – Wintermute Apr 23 '16 at 13:18
  • @Wintermute k. But I feel, though presumably very pythonic, jamming so much info and control flow into one line is not very maintainable. But to each his own I guess – karina Apr 23 '16 at 13:22
-2

In the function get the list of columns and then for each element in the list find row[list-element] and append to a list to for the record.

Ani Menon
  • 27,209
  • 16
  • 105
  • 126
  • This gets all of the columns; he seems to want to be able to choose which get pulled out of CSV. – a p Apr 22 '16 at 16:51
  • No, I think he wants to pull all columns. – Ani Menon Apr 22 '16 at 16:52
  • No, I want to choose which ones get pulled, as specified in a `columns` list I'll pass into the function. What I'm struggling with is how to pass that columns list into the line that populates the `records` variable – Wintermute Apr 22 '16 at 16:53
  • @Wintermute Then just pass column names as a list like `['a','b','c']` and in the function as I said use a for each to find `row[list-element]`. – Ani Menon Apr 22 '16 at 16:56
  • Why the down votes? @A P has put up code for my explanation. – Ani Menon Apr 22 '16 at 16:57
  • Hi. I don't mean to sound snide or ungrateful, but I downvoted this because while it is, as you say, a valid explanation of code that someone else provided, what I was struggling with was the syntax, not the concept. Your answer didn't tell me anything I didn't know. – Wintermute Apr 23 '16 at 13:25
  • @Wintermute You hadn't specified you didn't know to code. I thought you wanted the explanation. – Ani Menon Apr 23 '16 at 14:05