I have a CSV file, file1.csv, which has 3 columns in each row. A sample looks like this:

A,d1,200
A,d2,250
A,d3,10
B,d1,100
B,d2,150
B,d4,45
.
.
.

The structure of the above data is location_id,dept_id,num_emp. What I want to do is break the records of the CSV file into chunks based on the value in the first column, so that each chunk contains the records for only one location, and then pass these chunks to a function one by one. I wrote this code based on this SO post, but I am getting the error TypeError: 'itertools._grouper' object has no attribute '__getitem__'. My current code is:

import csv
from itertools import groupby

def func(chunk):

    for line in chunk:
        print line

file_read = open('file1.csv', 'r')
reader = csv.reader(file_read)

for rows in groupby(reader):
    func(rows)

How can I break the records into chunks based on values in one column and pass the chunks to a function?

user2966197

1 Answer

How about the following approach? It reads in your CSV file and displays the information grouped by the first column:

import csv
import itertools

def display_group(group):
    print "Group {}".format(group[0][0])

    for entry in group:
        print entry

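groups = []
location_ids = []

with open('file1.csv', 'r') as f_input:
    csv_input = csv.reader(f_input)

    # groupby yields one (key, group) pair per run of consecutive rows
    # that share the same first column
    for k, g in itertools.groupby(csv_input, key=lambda x: x[0]):
        groups.append(list(g))      # copy the rows out of the group iterator
        location_ids.append(k)

print "Location IDs:", location_ids

for group in groups:
    display_group(group)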

This would display the following with your data:

Location IDs: ['A', 'B']
Group A
['A', 'd1', '200']
['A', 'd2', '250']
['A', 'd3', '10']
Group B
['B', 'd1', '100']
['B', 'd2', '150']
['B', 'd4', '45']
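
If you would rather hand each chunk to your function as it is read, instead of collecting every group in a list first, something along these lines might work. This is only a sketch written for Python 3 (the code above uses Python 2 print statements). The names `iter_chunks` and `process_chunk` and the inline `SAMPLE` data are made up for illustration, and `process_chunk` just sums the third column as a stand-in for your own function.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Inline sample standing in for file1.csv (same layout as in the question).
SAMPLE = """A,d1,200
A,d2,250
A,d3,10
B,d1,100
B,d2,150
B,d4,45
"""


def iter_chunks(rows, column=0):
    """Yield (key, chunk) pairs, one per run of consecutive rows sharing `column`."""
    for key, group in groupby(rows, key=itemgetter(column)):
        # Copy the rows out now: the group iterator is no longer valid
        # once groupby advances to the next key.
        yield key, list(group)


def process_chunk(key, chunk):
    """Hypothetical stand-in for the function that receives each chunk."""
    return key, sum(int(row[2]) for row in chunk)


reader = csv.reader(io.StringIO(SAMPLE))
totals = [process_chunk(key, chunk) for key, chunk in iter_chunks(reader)]
print(totals)
```

With a real file you would pass `csv.reader(open('file1.csv', newline=''))` instead of the `StringIO` object. Note that `groupby` only groups rows that are next to each other, so if the rows for one location are scattered through the file, sort them first, for example with `sorted(reader, key=itemgetter(0))`.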
Martin Evans
  • Does group[0][0] give me the group key? What I mean is: if my dataset has 5 groups (5 different location_id values), is there any way I can store them, say in a list, while forming the groups? – user2966197 Sep 18 '15 at 16:13
  • It gives you the first column of the first entry in a group, so yes it will give you the group key. – Martin Evans Sep 18 '15 at 16:19
  • I have updated the script to also store a separate list of the location IDs, which is printed first. – Martin Evans Sep 18 '15 at 16:24
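
To illustrate the point from the comments, here is a small Python 3 sketch (the `rows` list is made-up sample data in the same shape as the question's file) checking that the key returned by `groupby` is the same value as `group[0][0]`, while collecting the keys in a list:

```python
from itertools import groupby

# Hypothetical rows, already parsed, in the same layout as file1.csv.
rows = [
    ['A', 'd1', '200'],
    ['A', 'd2', '250'],
    ['B', 'd1', '100'],
]

location_ids = []
for key, group in groupby(rows, key=lambda row: row[0]):
    chunk = list(group)
    # The groupby key matches the first column of the chunk's first row.
    assert key == chunk[0][0]
    location_ids.append(key)

print(location_ids)
```

So either value can be used as the group key; taking it from `groupby` directly avoids indexing into the chunk.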