I have a dataframe like this:
customer_id | date | category
1 | 2017-2-1 | toys
2 | 2017-2-1 | food
1 | 2017-2-1 | drinks
3 | 2017-2-2 | computer
2 | 2017-2-1 | toys
And I want to convert this dataframe into this:
customer_id | toys | food | drinks | computer
1 | 1 | 0 | 1 | 0
2 | 1 | 1 | 0 | 0
3 | 0 | 0 | 0 | 1
I want to group by customer_id and date, one-hot encoding the categories to show whether that customer purchased anything in each category on the same day.
I know the groupby() method and I tried df.groupby(['customer_id', 'date']), but that doesn't seem to work, and I can't figure out how to make the values in 'category' my new columns.
I've looked at the post about pivot_table(), but I can't find any information on restricting each row to a single day.
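For concreteness, here is a minimal sketch of the transformation described above. It assumes pandas is imported as pd and rebuilds the sample data by hand. Using pd.crosstab is only one possible approach and does not come from the pivot_table() post. The names counts and flags are made up for illustration.

```python
import pandas as pd

# Sample data copied from the table above
df = pd.DataFrame({
    "customer_id": [1, 2, 1, 3, 2],
    "date": ["2017-2-1", "2017-2-1", "2017-2-1", "2017-2-2", "2017-2-1"],
    "category": ["toys", "food", "drinks", "computer", "toys"],
})

# Count purchases per (customer_id, date) pair and category.
# Passing both keys as the index keeps each row within a single day.
counts = pd.crosstab([df["customer_id"], df["date"]], df["category"])

# Cap counts at 1 so each cell is a 0/1 flag rather than a purchase count
flags = counts.clip(upper=1)

# Order the category columns by first appearance and turn the
# (customer_id, date) index back into ordinary columns
flags = flags[list(df["category"].unique())].reset_index()
flags.columns.name = None

print(flags)
```

Note that this keeps the date column, so a customer who buys on two different days gets two rows. Dropping date afterwards would give the layout shown in the desired output.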
Thank you.