I have a CSV file containing many serial numbers and material numbers, for example as shown below. I need only the first two columns (serial and chassis); the rest are not required.

serial          chassis      type   date
ZX34215         Test         XX     YY
ZX34215         final-001    XX     YY
AB30000         Used         XX     YY
ZX34215         final-002    XX     YY

The snippet below reads the serial and material numbers into a dictionary, but duplicate keys are eliminated and only the latest value for each serial is kept.

Working code

import csv

with open('file1.csv', mode='r') as infile:
    reader = csv.reader(infile)
    # Later rows overwrite earlier rows that share the same serial.
    mydict1 = {rows[0]: rows[1] for rows in reader}
    print(mydict1)

I also need to capture the duplicate keys with their respective values, but my attempt failed. I used Python's defaultdict, and it looks like I missed something.

not working

from collections import defaultdict
with open('file1.csv',mode='r') as infile:
    data=defaultdict(dict)
    reader=csv.reader(infile)
    list_res = list(reader)
    for row in reader:
        result=data[row[0]].append(row[1])
        print(result)

Can someone correct me so that duplicate keys are captured in the dictionary?

Vinod HC
  • Possible duplicate of [Remove specific characters from a string in python](http://stackoverflow.com/questions/3939361/remove-specific-characters-from-a-string-in-python) – SuReSh Mar 10 '16 at 11:22

1 Answer

You need to pass list, not dict, as the default factory of your defaultdict:

data=defaultdict(list)

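Also, you don't need to convert the reader object to a list in order to iterate over it (doing so exhausts the reader, so the loop that follows never runs), and you shouldn't assign the result of append to a variable in each iteration, since append returns None: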

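import csv
from collections import defaultdict

data = defaultdict(list)
with open('file1.csv') as infile:
    reader = csv.reader(infile)
    for row in reader:
        try:
            # Group every chassis value under its serial number.
            data[row[0]].append(row[1])
        except IndexError:
            # Skip rows that have fewer than two columns.
            pass
    print(data)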
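
As a quick check of this approach, here is a minimal sketch run against the sample rows from the question. It assumes the file is comma-separated and starts with a header row; the `io.StringIO` stand-in for the file and the helper name `group_by_serial` are illustrative and not part of the original code.

```python
import csv
import io
from collections import defaultdict

# Sample rows from the question, assumed here to be comma-separated.
SAMPLE = """serial,chassis,type,date
ZX34215,Test,XX,YY
ZX34215,final-001,XX,YY
AB30000,Used,XX,YY
ZX34215,final-002,XX,YY
"""

def group_by_serial(infile):
    """Map each serial to the list of chassis values seen for it."""
    data = defaultdict(list)
    reader = csv.reader(infile)
    next(reader, None)  # skip the header row (assumed to be present)
    for row in reader:
        if len(row) < 2:  # ignore blank or short rows
            continue
        data[row[0]].append(row[1])
    return data

result = group_by_serial(io.StringIO(SAMPLE))
print(dict(result))
# {'ZX34215': ['Test', 'final-001', 'final-002'], 'AB30000': ['Used']}
```

Checking `len(row)` up front plays the same role as the `try`/`except IndexError` above; which one reads better is largely a matter of taste.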
Mazdak
  • Hi, I have many columns which I am ignoring, and I need to consider only the first 2 columns in the CSV (serial and chassis). With my actual CSV file I get the error below: for col1, col2 in reader: ValueError: too many values to unpack (expected 2) – Vinod HC Mar 10 '16 at 11:34
  • @VinodHC That's because your rows don't all have the same number of items. You can use a `try-except` statement to handle the error. – Mazdak Mar 10 '16 at 11:43
  • Hi, I have some rows with empty serial numbers in the CSV file and I get the error below. How can I ignore this and continue? for row in reader: _csv.Error: line contains NULL byte – Vinod HC Mar 13 '16 at 14:11
  • @VinodHC http://stackoverflow.com/questions/4166070/python-csv-error-line-contains-null-byte – Mazdak Mar 13 '16 at 14:13
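
The issues raised in these comments (extra columns, rows with an empty serial, and NUL bytes) could be handled along the following lines. This is only a sketch under stated assumptions: the helper name `read_serials` and the inline sample lines are hypothetical, the data is assumed to be comma-separated, and dropping NUL characters is assumed to be acceptable for the file in question. A NUL byte can also mean the file is in a different encoding than expected, in which case opening it with the right `encoding` is the better fix.

```python
import csv
import io
from collections import defaultdict

def read_serials(lines):
    """Group chassis values by serial, tolerating messy rows."""
    data = defaultdict(list)
    # Drop NUL characters from each line before the csv module parses it.
    cleaned = (line.replace('\0', '') for line in lines)
    for row in csv.reader(cleaned):
        # Skip blank rows, short rows, and rows with an empty serial.
        if len(row) < 2 or not row[0]:
            continue
        # Index the first two columns instead of unpacking the whole row,
        # so any extra columns are simply ignored.
        serial, chassis = row[0], row[1]
        data[serial].append(chassis)
    return data

# Hypothetical input: extra columns, an empty serial, a NUL byte, a blank line.
sample = io.StringIO(
    "ZX34215,Test,XX,YY\n"
    ",Orphan,XX,YY\n"
    "ZX34\x00215,final-001,XX,YY\n"
    "\n"
    "AB30000,Used\n"
)
result = read_serials(sample)
print(dict(result))
```

With a real file, the same function can be called as `read_serials(infile)` inside a `with open('file1.csv') as infile:` block.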