I am trying to find the most frequent combination of product names in a column for each transaction number. I have tried using groupby and dummy variables but can't seem to get too far with them. Here is a sample from the original dataframe:
Timestamp Transaction Number Product UPC Number Price Retailer
\
0 3/2/18 08:15:50 123098 111100001234 1.80 Fast Mart
1 3/2/18 08:15:50 123098 111100001235 23.45 Fast Mart
2 3/2/18 08:15:50 123098 111100001236 6.99 Fast Mart
3 3/2/18 08:19:52 123099 111100001236 6.99 Quick Stop
4 3/2/18 08:19:52 123099 111100001237 7.15 Quick Stop
Product Name
0 Coke 20 oz
1 Miller Lite 24 Pack
2 Pepsi 12 Pack
3 Pepsi 12 Pack
4 Coke 12 Pack
Right now all my code has led to dead ends. The question is: how can I find the most frequent pair of 'Product Name' values across each unique 'Transaction Number'?
Code:
dummy_df = pd.get_dummies(pos_df,columns = ['Product Name'])
dummy_df.groupby('Transaction Number').sum().head()
This leads me to a spot with data that I am not sure how to use. Would appreciate any insights.