
I have a dataset of the items each user purchased across multiple sessions, as follows:

user, session, items, quantity
1,    3,    item5,  2
1,    3,    item4,  1
1,    3,    item2,  1
1,    3,    item5,  2
1,    14,   item2,  1
1,    14,   item4,  1

2,     8,   item1,  1
2,     8,   item3,  1
2,     8,   item4,  3
2,     9,   item4,  3

I want to put the total quantity of each item per user in a DataFrame, like this:

       item1     item2    item3   item4    item5
user1   NaN        2       NaN       2        4
user2    1        NaN       1        6       NaN

I tried grouping the items for each user and counting them with a dictionary, which gives {item2: 2, item4: 2, item5: 2}, but the real quantity of item5 is 4, not 2.

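import pandas as pd

# sessions_bought_items and user are defined elsewhere in my code
temp = set(sessions_bought_items)
dic = {}
for j in temp:
    # counts how many rows mention the item, not the purchased quantity
    dic[j] = sessions_bought_items.count(j)
df = pd.DataFrame(dic, index=[user], columns=list(dic.keys()))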

I also tried pivot_table(values=quantity, index=user, columns=items), but for the repeated item4 rows of user 2 it returns 3 instead of the total, 6. The problem is computing the final quantity of each item for each user.
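For reference, here is a minimal sketch of the result I am after. It rebuilds the sample rows above as a DataFrame; the column names user, session, items and quantity are my assumption about how the data is loaded. As far as I understand, pivot_table averages duplicate entries by default (aggfunc="mean"), which would explain the 3 for two rows of 3. Passing aggfunc="sum" adds them up instead.

```python
import pandas as pd

# Sample rows copied from the question (the repeated item5 row is kept on purpose).
rows = [
    (1, 3, "item5", 2),
    (1, 3, "item4", 1),
    (1, 3, "item2", 1),
    (1, 3, "item5", 2),
    (1, 14, "item2", 1),
    (1, 14, "item4", 1),
    (2, 8, "item1", 1),
    (2, 8, "item3", 1),
    (2, 8, "item4", 3),
    (2, 9, "item4", 3),
]
# Column names are assumed; adjust them to match the real dataset.
df = pd.DataFrame(rows, columns=["user", "session", "items", "quantity"])

# Sum the quantities of every (user, item) pair instead of averaging them.
totals = df.pivot_table(values="quantity", index="user",
                        columns="items", aggfunc="sum")

# Alternative spelling of the same idea: group, sum, then spread items into columns.
totals_groupby = df.groupby(["user", "items"])["quantity"].sum().unstack()

print(totals)
```

Items a user never bought stay NaN in both versions, which matches the layout I described above.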

nucsit026