I'm new to Python (I usually work in R), so apologies if this is too easy. I am trying to convert a CSV file containing student numbers, course IDs (7 courses in total) and ratings into a dictionary. This differs from similar questions because the key column in the CSV file is not unique: each student number is repeated once for every course that student evaluated. The sample data looks like this:
participant_id;course_id;rating
103;4;2
104;5;3.5
104;7;2.5
108;3;3.5
108;5;2
114;2;4.5
114;5;3.5
114;7;4.5
116;1;2
116;2;3
116;3;3
116;4;4
126;5;3
129;1;4
129;5;3.5
135;1;4.5
The desired outcome looks like this: the student number is the key and the value is a list, where the course_id determines the position in the list (course 1 comes first) and the rating is the element. Every course the student did not rate is 'NA'.
{'103': ['NA', 'NA', 'NA', 2.0, 'NA', 'NA', 'NA'],
'104': ['NA', 'NA', 'NA', 'NA', 3.5, 'NA', 2.5],
'108': ['NA', 'NA', 3.5, 'NA', 2.0, 'NA', 'NA'],
'114': ['NA', 4.5, 'NA', 'NA', 3.5, 'NA', 4.5],
'116': [2.0, 3.0, 3.0, 4.0, 'NA', 'NA', 'NA'],
'126': ['NA', 'NA', 'NA', 'NA', 3.0, 'NA', 'NA'],
'129': [4.0, 'NA', 'NA', 'NA', 3.5, 'NA', 'NA'],
'135': [4.5, 'NA', 'NA', 'NA', 'NA', 'NA', 'NA']}
I extracted the student numbers with set(), so I now have the unique student numbers and can build a dictionary with the right keys. However, every course rating is still 'NA', because I don't know how to group the course_id and rating values by student and put them into the lists. Here is my code so far:
def ratings(filename):
    with open(filename) as fp:
        buffer = fp.readlines()
        stu_id = []
        dic = {}
        for i in (buffer):
            stu_id.append(i.split(';')[0])
        stu_id_set = list(set(stu_id))
        for j in stu_id_set:
            dic[j] = ['NA','NA','NA','NA','NA','NA','NA']
    return dic
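
For reference, here is a minimal sketch of one way the desired dictionary could be built. It is not the only approach. It assumes the file is semicolon-delimited with the header row shown above, that course_id always runs from 1 to 7, and that rating always parses as a number. The n_courses parameter and the slots variable are names introduced here purely for illustration.

```python
import csv

N_COURSES = 7  # assumption: course_id runs from 1 to 7, as stated in the question


def ratings(filename, n_courses=N_COURSES):
    """Map each participant_id to a list of ratings indexed by course_id - 1."""
    dic = {}
    with open(filename, newline="") as fp:
        # DictReader consumes the header row, so it never becomes a key.
        reader = csv.DictReader(fp, delimiter=";")
        for row in reader:
            # Create the all-'NA' list the first time a student is seen,
            # otherwise reuse the list already stored for that student.
            slots = dic.setdefault(row["participant_id"], ["NA"] * n_courses)
            # course_id is 1-based, list indices are 0-based.
            slots[int(row["course_id"]) - 1] = float(row["rating"])
    return dic
```

Because each row is written straight into the list of the student it belongs to, the duplicated student numbers need no separate grouping step, and set() is not required.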