I have a dataframe that looks like this:
User | Product |
---|---|
1 | a |
1 | b |
2 | a |
2 | c |
3 | b |
I want one row per user, with the products as columns and a 1 or 0 in each cell indicating whether the user purchased that product. How can I do this?
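What you are looking for is a "cross tabulation", or simply a crosstab. Pandas provides pd.crosstab for this. Each cell holds the number of matching rows, which is 1 or 0 here because no (User, Product) pair repeats.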
pd.crosstab(df['User'], df['Product'])
Product  a  b  c
User
1        1  1  0
2        1  0  1
3        0  1  0
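If the real data can contain the same (User, Product) pair more than once, crosstab returns counts above 1. A minimal sketch of capping them to a 0/1 flag is shown here; the repeated row for user 1 and product a is an assumption added only to demonstrate the effect, and the names `counts` and `flags` are illustrative.

```python
import pandas as pd

# Sample data from the question, plus one repeated purchase (1, "a").
# The repeated row is an assumption, added to show counts above 1.
df = pd.DataFrame({
    "User": [1, 1, 2, 2, 3, 1],
    "Product": ["a", "b", "a", "c", "b", "a"],
})

# crosstab counts how many rows fall into each (User, Product) cell.
counts = pd.crosstab(df["User"], df["Product"])

# Cap every count at 1 so each cell is a purchased / not purchased flag.
flags = counts.clip(upper=1)

print(flags)
```

With the question's original five rows the `clip` step changes nothing, so it is safe to leave in either way.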
df.pivot_table(index="User", columns="Product", aggfunc=len).fillna(0)
# Result:
Product    a    b    c
User
1        1.0  1.0  0.0
2        1.0  0.0  1.0
3        0.0  1.0  0.0
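The floats in that result come from the missing cells being NaN before `fillna(0)`. If integer 0/1 flags are wanted, one possible variant is sketched below. It assumes a helper column, here called `purchased` (a hypothetical name, not part of the question's data), so that pivot_table has an explicit value to aggregate.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "User": [1, 1, 2, 2, 3],
    "Product": ["a", "b", "a", "c", "b"],
})

wide = (
    df.assign(purchased=1)          # hypothetical helper column: 1 per purchase row
      .pivot_table(
          index="User",
          columns="Product",
          values="purchased",
          aggfunc="max",            # repeated purchases still yield 1
          fill_value=0,             # pairs that never occur become 0
      )
      .astype(int)                  # make sure the flags are integers
)

print(wide)
```

Using `max` rather than a count keeps the result a 0/1 flag even if a user bought the same product more than once.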