I have a pandas dataframe df, with the columns user and product. It describes which user buys which products, accounting for repeated purchases of the same product. E.g. if user 1 buys product 23 three times, df will contain the entry 23 three times for user 1.
For every user, I am interested in only those products that are bought at least three times by that user. Hence, I do s = df.groupby('user').product.value_counts(), and then I filter s = s[s>2], to discard the products not bought frequently enough. Then, s looks something like this:
user  product
3     39190      9
      47766      8
      21903      8
6     21903      5
      38293      5
11    8309       7
      27959      7
      14947      5
      35948      4
      8670       4
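For reference, here is a minimal sketch that reproduces this setup. The data is made up purely for illustration; only the groupby and filter steps come from the description above. Bracket access to the product column is used here as an assumption-free alternative to attribute access.

```python
import pandas as pd

# Hypothetical purchase log: one row per purchase.
df = pd.DataFrame({
    "user":    [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "product": [23, 23, 23, 7, 5, 5, 5, 5, 9],
})

# Count purchases per (user, product). The result is a Series whose
# index is a MultiIndex (user, product) and whose values are the counts.
s = df.groupby("user")["product"].value_counts()

# Keep only the products a user bought at least three times.
s = s[s > 2]
print(s)
```

The point to note is that user and product are not columns of s; they are the two levels of its index, and the counts are the values.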
Having filtered the data, I am not interested in the frequencies (the right column) any more.
How can I create a dict of the form user:product based on s? I have trouble accessing the individual columns/index of the Series.
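To make the target concrete, here is one possible sketch, assuming the desired result maps each user to a list of that user's frequently bought products. The data and the name user_products are made up for illustration. It relies on the fact that iterating over a MultiIndex yields one tuple per row, here (user, product).

```python
from collections import defaultdict

import pandas as pd

# Hypothetical purchase log: one row per purchase.
df = pd.DataFrame({
    "user":    [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    "product": [23, 23, 23, 23, 7, 7, 7, 4, 5, 5, 5, 5],
})

s = df.groupby("user")["product"].value_counts()
s = s[s > 2]

# The counts (the values of s) are ignored; only the index is used.
user_products = defaultdict(list)
for user, product in s.index:
    user_products[user].append(product)

user_products = dict(user_products)
print(user_products)  # user 1 maps to products 23 and 7, user 2 to product 5
```

If only one level is needed, s.index.get_level_values(0) and s.index.get_level_values(1) return the users and the products separately.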