1

I am attempting to calculate all variables of a specific value in a given column from an Excel document. I want to be able to iterate over the column and calculate the total of each instance... e.g. how many students received a grade "A".

Here is what I have so far...

test.xls:

Name, Class, Grade

James, Math, A

Judy, Math, A

Bill, Social Studies, B

Denice, History, C

Sarah, History, B

Here is my python script

import xlrd
from collections import Counter
sh = xlrd.open_workbook('test.xls', on_demand = True).sheet_by_index(0) # Open workbook and sheet

 for rownum in range(sh.nrows):
    grades = str(sh.cell(rownum, 2).value) # Grab all variables in column 2.
    print Counter(grades.split('\n'))  # Count grades

Expected output:

A = 2

B = 2

C = 1

Actual output:

Counter({'Grade': 1})

Counter({'A': 1})

Counter({'A': 1})

Counter({'B': 1})

Counter({'C': 1})

Counter({'B': 1})

As each grade is showing in a different list I have been unable to merge/concatenate lists to get a total. Also it is not in the desired output formatting.

John Doe
  • 13
  • 3

2 Answers2

0
for rownum in range(sh.nrows):
    grades = str(sh.cell(rownum, 2).value) # Grab all variables in column 2.
    print Counter(grades.split('\n'))  # Count grades

You are creating a list in every iteration.

You can use list comprehension to a create a single list with all the grades:

grades = [str(sh.cell(rownum, 2).value) for rownum in range(sh.nrows)]
print Counter(grades)

Or without comprehension :

grades = []
for rownum in range(sh.nrows):
    grades.append(str(sh.cell(rownum, 2).value))

print Counter(grades)

You would still need to format the content of Counter(grades) to your liking:

res = Counter(grades)
for grade, count in res.iteritems():
    print '{} = {}'.format(grade, count) 

Note that:

  1. I'm not using split.

  2. The output won't be in any particular order, and in fact might change between consecutive runs of the script.

DeepSpace
  • 78,697
  • 11
  • 109
  • 154
0

You can start by instantiating a Counter and then add grades to it while you iterate:

grades_counter = Counter()
mysheet = xlrd.open_workbook('grades.xls').sheet_by_index(0)

for i in range(1,mysheet.nrows):
    grades_counter += Counter(str(mysheet.row_values(i)[2]))

print grades_counter
Counter({'A': 2, 'B': 2, 'C': 1})

If you are looking to print the output in a more elegant way, you can do the following:

for k,v in grades_counter.items():
    print "{} = {}".format(k,v)

You should get:

A = 2
C = 1
B = 2

I hope this helps.

Abdou
  • 12,931
  • 4
  • 39
  • 42
  • 1
    This is nearly perfect, thank you! Also to complete the question, I have been able to format the counter using this answered question: http://stackoverflow.com/questions/20316299/formatting-output-of-counter#20316330 – John Doe Aug 18 '16 at 13:57