5

I want to read a CSV file's columns directly into variables. The result should be something like you would get with the following shell line: while IFS=, read ColumnName1 ColumnName2 ColumnName3 do stuff

So far the answer seems to be with csv.DictReader, but I have not been able to make it work. I do not have a header row, so column names would have to be created manually. (with a dictionary I think in the form mydictionary={ 'ColumnName1':0, 'ColumnName2':1, 'ColumnName3':3 } )

Also, can the columns be referenced as simple variable names or must you use a list[index] style reference. A code sample just printing the columns by name would be nice. Thanks for the help.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
Wyatt
  • 153
  • 2
  • 3
  • 10

6 Answers6

7

The built-in CSV Module is quite useful when working with csv files.


Oh, nevermind, you must be using it already, if you are looking at DictReader.

The usual way I deal with files that have no header would be to read the first line, parse it for the number of commas (and hence the number of columns) then set up my dictionary/list to contain the values from the csv file (using number of columns and giving each column a name in my my code.) I can provide an example if necessary, it's pretty straightforward.


I think I better understand your question, is this more what you are looking for?:

mydictionary={ 'ColumnName1':[dataRow1Col1, dataRow2Col1, dataRow3Col1], 
               'ColumnName2':[dataRow1Col2, dataRow2Col2, dataRow3Col2], 
               'ColumnName3':[dataRow1Col3, dataRow2Col3, dataRow3Col3] }

In which case, something like this may work:

import csv
Col1 = "ColumnName1"
Col2 = "ColumnName2"
Col3 = "ColumnName3"
mydictionary={Col1:[], Col2:[], Col3:[]}
csvFile = csv.reader(open("myfile.csv", "rb"))
for row in csvFile:
  mydictionary[Col1].append(row[0])
  mydictionary[Col2].append(row[1])
  mydictionary[Col3].append(row[2])
chisaipete
  • 884
  • 9
  • 20
  • 1
    Why would you ever count up the number of columns, just to set up variables for it? If the file doesn't have enough columns for your code then no amount of setting up dictionaries or lists would put enough into it. – Ignacio Vazquez-Abrams Jun 15 '11 at 00:20
  • My description was general and not intended to be all inclusive of certain corner cases or all parameter checking that must be done. I was going for clarity and understanding of methodology, not exact implementation. – chisaipete Jun 15 '11 at 00:25
3
for row in thingthatyieldslists:
  col1, col2, col3 = row
  print "%s: %s, %s" % (col1, col2, col3)
Ignacio Vazquez-Abrams
  • 776,304
  • 153
  • 1,341
  • 1,358
  • It doesn't. You said you wanted the "CSV file's columns directly into variables". That is what I have done. – Ignacio Vazquez-Abrams Jun 14 '11 at 23:13
  • Could you give more detail. I am getting Value error: too many values to unpack. Everything I have read seems to say to use the CSV module to read csv files and dictreader in particular. Are you familiar with this? – Wyatt Jun 14 '11 at 23:23
  • 1
    `DictReader` is pointless if, as you say, the file has no header. And the code I gave is for files with 3 columns; use fewer or more variables as the situation calls for. – Ignacio Vazquez-Abrams Jun 14 '11 at 23:24
2

Is this what you were looking for (Python 3.X):

import csv
from io import StringIO

# Simulate a csv data file with no header
data = StringIO('''\
Mark,44,1
Joe,22,0
Craig,39,3
''')

for row in csv.DictReader(data,'Name Age Children'.split()):
    print(row)

Output

{'Age': '44', 'Name': 'Mark', 'Children': '1'}
{'Age': '22', 'Name': 'Joe', 'Children': '0'}
{'Age': '39', 'Name': 'Craig', 'Children': '3'}

Or maybe:

import csv
from io import StringIO

# Simulate a csv data file with no header
data = StringIO('''\
Mark,44,1
Joe,22,0
Craig,39,3
''')

# 1. Read in all the data
# 2. Transpose into columns
# 3. Match with column names
# 4. Create dictionary
cols = dict(zip('Name Age Children'.split(),zip(*csv.reader(data))))
print(cols)

Output

{'Age': ('44', '22', '39'), 'Name': ('Mark', 'Joe', 'Craig'), 'Children': ('1', '0', '3')}
Mark Tolonen
  • 166,664
  • 26
  • 169
  • 251
1

I recognize this post is almost two years old, but I want to begin contributing positively to stackoverflow.

I assume you want this to be more general than just three columns, but that efficiency doesn't matter.

import csv
f = csv.reader(open('my file.csv','r'))
e = f.next() #Give us a sample of our data
ecount = len(e)
for i in range(ecount): #use range here to keep variable order in CSV preserved in naming
    vars()['ColumnName'+str(i+1)]] = list(e[i]) # makes a list for each item 
for j in f:
    for l in range(ecount): 
        vars()['ColumnName'+str(l+1)].append(j[l]) #Reads in variable values to list
1

Thank you all for working with my question. Here is what I ended up doing. Simple, but it worked. Again, problem was to read a headless CSV into variables so that I could "do stuff"

import sys

for record in sys.stdin.readlines():
    record = record.rstrip()
    ColumnName1, ColumnName2, ColumnName2 = record.split(',')

This does the same thing as the shell code I posted in the question: while IFS=, read ColumnName1 ColumnName2 ColumnName3 do stuff

Thanks for all the help. I will be asking more questions soon!!

Karl
  • 1,664
  • 2
  • 12
  • 19
Wyatt
  • 153
  • 2
  • 3
  • 10
0

Here's a "dictreader" for a headless csv, each row will be a dictionary with sequential keys 'column_0', 'column_1', 'column_2' and so on...

import csv

csvfile = list(csv.reader(open('data.csv')))

csvdics = []

for row in csvfile:
    row_dict = {}
    for i in xrange(len(row)):
        row_dict['column_%s' % i] = row[i]
    csvdics.append(row_dict)

Or, if you know ahead of time what the column names should be, you can pass them in a list as a second argument to DictReader.

lebowski
  • 316
  • 2
  • 11