2

Is it possible to read data from a csv file into a dictionary, such that the first row of a column is the key and the remaining rows of that same column constitute the value as a list?

E.g. I have a csv file

     strings, numbers, colors 
     string1, 1, blue
     string2, 2, red
     string3, 3, green
     string4, 4, yellow

using

with open(file,'rU') as f: 
    reader = csv.DictReader(f)
    for row in reader:
        print row

I obtain

{'color': 'blue', 'string': 'string1', 'number': '1'}
{'color': 'red', 'string': 'string2', 'number': '2'}
{'color': 'green', 'string': 'string3', 'number': '3'}
{'color': 'yellow', 'string': 'string4', 'number': '4'}

or using

 with open(file,'rU') as f: 
        reader = csv.reader(f)
        mydict = {rows[0]:rows[1:] for rows in reader}
        print(mydict)

I obtain the following dictionary

{'string3': ['3', 'green'], 'string4': ['4', 'yellow'], 'string2': ['2', 'red'], 'string': ['number', 'color'], 'string1': ['1', 'blue']}

However, I would like to obtain

{'strings': ['string1', 'string2', 'string3', 'string4'], 'numbers': [1, 2, 3,4], 'colors': ['red', 'blue', 'green', 'yellow']}
Vicky
  • 127
  • 1
  • 3
  • 9
  • possible duplicate of [Creating a dictionary from a csv file?](http://stackoverflow.com/questions/6740918/creating-a-dictionary-from-a-csv-file) – d33tah Jun 06 '15 at 17:01
  • No that post has unique key value pairs for each row. – Vicky Jun 06 '15 at 17:05
  • `csv.DictReader` is probably not the best of the choices - if you follow the question above and play around a bit - you should get your answer. For example you collect `row[0]` in a key and everything else in values `rows[1:]`. Now give it a try. – gabhijit Jun 06 '15 at 17:14
  • I've been trying for a few hours already, but perhaps I described my problem badly. What I am trying to do is to collect col[0] in a key and col[1:] in values. But I can't loop over the columns of the reader object, only over the rows. This is what I am struggling with. – Vicky Jun 06 '15 at 17:44

3 Answers3

8

You need to parse the first row, create the columns, and then progress to the rest of the rows.

For example:

columns = []
with open(file,'rU') as f: 
    reader = csv.reader(f)
    for row in reader:
        if columns:
            for i, value in enumerate(row):
                columns[i].append(value)
        else:
            # first row
            columns = [[value] for value in row]
# you now have a column-major 2D array of your file.
as_dict = {c[0] : c[1:] for c in columns}
print(as_dict)

output:

{
    ' numbers': [' 1', ' 2', ' 3', ' 4'], 
    ' colors ': [' blue', ' red', ' green', ' yellow'],
    'strings': ['string1', 'string2', 'string3', 'string4']
}

(some weird spaces, which were in your input "file". Remove spaces before/after commas, or use value.strip() if they're in your real input.)

tehwalrus
  • 2,589
  • 5
  • 26
  • 33
4

This is why we have the defaultdict

from collections import defaultdict
from csv import DictReader

columnwise_table = defaultdict(list)
with open(file, 'rU') as f:
    reader = DictReader(f)
    for row in reader:
        for col, dat in row.items():
            columnwise_table[col].append(dat)
print columnwise_table
Semmel
  • 2,526
  • 3
  • 21
  • 30
Pete Cacioppi
  • 880
  • 7
  • 11
  • Thanks for the edit. It is my habit to replace `defaultdict` with `dict` when I am done taking advantage of the default behavior, my code as originally written wasn't doing exactly that so better to KISS. – Pete Cacioppi Sep 12 '17 at 17:04
  • hi Peter what would be the way to achieve the same result using csv. What I mean is open multiple columns csv into a dictionary? – Ian_De_Oliveira Aug 22 '18 at 06:46
0

Yes it is possible: Try it this way:

import csv
from collections import defaultdict

D=defaultdict(list)
csvfile=open('filename.csv')
reader= csv.DictReader(csvfile)  # Dictreader uses the first row as dictionary keys
for l in reader:                 # each row is in the form {k1 : v1, ... kn : vn}
    for k,v in l.items():  
        D[k].append(v)       
...................
...................

Assuming filename.csv has some data like

strings,numbers,colors 
string1,1,blue
string2,2,red
string3,3,green
string4,4,yellow

then D will result in

defaultdict(<class 'list'>, 
            {'numbers': ['1', '2', '3', '4'], 
             'strings': ['string1', 'string2', 'string3', 'string4'], 
             'colors':  ['blue', 'red', 'green', 'yellow']})