I have a CSV file with headers at the top of columns of data as:
a,b,c
1,2,3
4,5,6
7,8,9
and I need to read it into a dict of lists:
desired_result = {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
When reading this with DictReader, I am using a nested loop to append the items to the lists:
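import csv

f = 'path_to_some_csv_file.csv'
dr = csv.DictReader(open(f))
# Pull the first data row; next(dr) also works where dr.next() does not (Python 3)
dict_of_lists = next(dr)
# Wrap each first-row value in a list so later rows can be appended
for k in dict_of_lists.keys():
    dict_of_lists[k] = [dict_of_lists[k]]
# Append the values of every remaining row to the matching column list
for line in dr:
    for k in dict_of_lists.keys():
        dict_of_lists[k].append(line[k])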
The first loop wraps each value from the first data row in a one-element list. The next one loops over every remaining line read in from the CSV file, from which DictReader creates a dict of key-values. The inner loop appends each value to the list matching the corresponding key, so I wind up with the desired dict of lists (with the values still as strings, not ints). I end up having to write this fairly often.
My question is: is there a more Pythonic way of doing this using built-in functions without the nested loop, or a better idiom, or an alternative way to store this data so that I can get an indexable list back by querying with a key? If so, is there also a way to convert the data column by column as it is read in?
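To make concrete the kind of idiom I am after, here is a rough sketch of one possible shape, not a definitive solution. It transposes the rows with zip(*rows) instead of a nested loop and applies an optional per-column conversion. The names read_columns and converters are hypothetical, made up for this illustration, and the sample data is fed from an in-memory string rather than a real file. It assumes Python 3 and that every row has the same number of fields.

```python
import csv
import io


def read_columns(lines, converters=None):
    """Return {header: [values, ...]} from an iterable of CSV lines.

    Hypothetical helper: `converters` maps a header name to a callable
    that is applied to every value in that column (default: keep str).
    """
    converters = converters or {}
    reader = csv.reader(lines)
    headers = next(reader)      # first row holds the column names
    columns = zip(*reader)      # transpose the remaining rows into columns
    return {
        h: [converters.get(h, str)(v) for v in col]
        for h, col in zip(headers, columns)
    }


# Sample data from the question, supplied via an in-memory file object
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6\n7,8,9\n")
result = read_columns(sample, converters={"a": int, "b": int, "c": int})
print(result)  # {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
```

Two caveats with this sketch: zip stops at the shortest row, so ragged rows would be silently truncated, and a file with only a header row yields an empty dict rather than a dict of empty lists.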