First a bit of advice: your question has all the right content, but the phrasing is quite poor. I am answering it because of the former, but I feel the need to point out the latter so you can avoid getting so many close votes in the future. "Any ideas to get me started would be very much appreciated!" and "Can anyone help?" are not valid questions for SO. The problem here is that they are fluff that detracts from the real question, to the point that most reviewers will see them as trigger phrases. In your case, you actually have a good clear problem statement, a coding attempt that is nearly spot-on, and all you need is help with a specific exception. Next time, phrase your question to be about your error or actual problem, and stay away from vagueness like "can you help?".
Enough of that.
A CSV reader is an iterable over the rows of the CSV. Each row is a list. Therefore, when you do `list(reader)`, you are actually creating a list of lists. In your case, each list contains only one element, but that is irrelevant to the `Counter`: lists are unhashable and so can't be dictionary keys, which is why you get your exception. Literally all you need to change is to extract the first element of each row before you pass it to the `Counter`. Replace `my_list = list(reader)` with any of the following:

    my_list = list(r[0] for r in reader)

OR

    my_list = [r[0] for r in reader]

OR

    counter = collections.Counter(r[0] for r in reader)

The last one creates a generator expression that will be evaluated lazily. It is probably your best option for a very large input since it will not retain the entire data set in memory, only the histogram.
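
To make the failure and the fix concrete, here is a minimal sketch for Python 3. The `io.StringIO` sample is a hypothetical stand-in for your file (one value per row), not your actual data:

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the CSV file: one value per row.
sample = io.StringIO("apple\nbanana\napple\n")
rows = list(csv.reader(sample))  # a list of lists, one inner list per row

try:
    Counter(rows)
    error_message = None
except TypeError as exc:
    # Lists are unhashable, so they cannot be used as Counter keys.
    error_message = str(exc)

# Extracting the first field of each row yields hashable strings.
counter = Counter(r[0] for r in rows)

print(error_message)
print(counter)
```

The first `Counter` call fails with a `TypeError`, while the second one counts the strings as intended.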
Since the generator is evaluated lazily, you cannot build the `Counter` outside the `with` block. If you attempt to do so, the file will already have been closed, and the generator will raise an error on the first iteration.
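
The closed-file pitfall can be reproduced with a short sketch, again assuming Python 3. The temporary file and the names `path`, `lazy_values`, and `failed` are made up for illustration:

```python
import csv
import os
import tempfile
from collections import Counter

# Hypothetical sample file, written only so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("apple\nbanana\napple\n")
    path = tmp.name

# Correct: the generator is consumed while the file is still open.
with open(path, newline="") as f:
    counter = Counter(r[0] for r in csv.reader(f))

# Incorrect: the generator is created inside the block but consumed after it.
with open(path, newline="") as f:
    lazy_values = (r[0] for r in csv.reader(f))  # nothing has been read yet

try:
    Counter(lazy_values)  # first read happens here, after f was closed
    failed = False
except ValueError:
    # Reading from a closed file raises ValueError.
    failed = True

os.remove(path)
print(counter, failed)
```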
You might get a slight speed boost by using `operator.itemgetter` instead of an explicit `r[0]` in any of the expressions above. All combined, the example below is pretty close to what you already have:
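
    import csv
    from collections import Counter
    from operator import itemgetter

    # 'rb' is the Python 2 mode; on Python 3 use open('test.csv', newline='')
    with open('test.csv', 'rb') as f:
        reader = csv.reader(f)
        g = itemgetter(0)
        # Build the Counter while the file is still open
        counter = Counter(g(r) for r in reader)
    print(counter)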
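
If you are on Python 3, a variant along these lines may be useful. It is only a sketch: `count_first_column` is a hypothetical helper name, the temporary file stands in for `test.csv`, and skipping blank lines is an added assumption about what you want (the reader yields an empty list for a blank line, so `row[0]` would otherwise raise an `IndexError`):

```python
import csv
import os
import tempfile
from collections import Counter


def count_first_column(path):
    """Count the values in the first column of the CSV file at `path`."""
    # Python 3: open in text mode with newline='' for the csv module.
    with open(path, newline="") as f:
        # `if row` skips blank lines, which csv.reader yields as empty lists.
        return Counter(row[0] for row in csv.reader(f) if row)


# Hypothetical sample data, including a blank line, to exercise the helper.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("apple\n\nbanana\napple\n")
    sample_path = tmp.name

result = count_first_column(sample_path)
os.remove(sample_path)
print(result)
```

Because the `Counter` is fully built before the function returns, the file is still open while the generator is consumed.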