I have a dataset of the items each user purchased across multiple sessions, as follows:
user, session, items, quantity
1, 3, item5, 2
1, 3, item4, 1
1, 3, item2, 1
1, 3, item5, 2
1, 14, item2, 1
1, 14, item4, 1
2, 8, item1, 1
2, 8, item3, 1
2, 8, item4, 3
2, 9, item4, 3
I want to put the total quantity of each item for each user in a DataFrame, like this:
item1 item2 item3 item4 item5
user1 NaN 2 NaN 2 4
user2 1 NaN 1 6 NaN
I tried grouping the items for each user and counting them with a dictionary, which gave {item2: 2, item4: 2, item5: 2}, but the real quantity of item5 is 4, not 2:
temp = set(sessions_bought_items)
dic = {}
for j in temp:
    dic[j] = sessions_bought_items.count(j)
df = pd.DataFrame(dic, index=[user], columns=list(dic.keys()))
I also tried pivot_table(values=quantity, index=user, columns=items), but it returns only one of the repeated values for item4 (3 instead of 6).
The problem is computing the total quantity of each item for each user.
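For reference, here is a minimal sketch of one way the desired table might be produced. It assumes the data is loaded into a DataFrame (called df here, a hypothetical name) with the columns user, session, items and quantity. pivot_table aggregates with the mean by default, which would explain getting 3 for item4 (the mean of 3 and 3); passing aggfunc="sum" adds the quantities instead.

```python
import pandas as pd

# Sample data from the question; the name `df` and the column labels are assumptions.
rows = [
    (1, 3, "item5", 2),
    (1, 3, "item4", 1),
    (1, 3, "item2", 1),
    (1, 3, "item5", 2),
    (1, 14, "item2", 1),
    (1, 14, "item4", 1),
    (2, 8, "item1", 1),
    (2, 8, "item3", 1),
    (2, 8, "item4", 3),
    (2, 9, "item4", 3),
]
df = pd.DataFrame(rows, columns=["user", "session", "items", "quantity"])

# Sum the quantities per (user, item) pair instead of using the default mean.
result = df.pivot_table(
    values="quantity", index="user", columns="items", aggfunc="sum"
)
print(result)
```

Items a user never bought show up as NaN, as in the desired output. An equivalent result should also be reachable with df.groupby(["user", "items"])["quantity"].sum().unstack().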