How to pivot pandas dataframe on specific rows

Question

I have the following dataframe:

df = pd.DataFrame({'recipe': ['a', 'a', 'a', 'a', 'a','a', 'b', 'b', 'b', 'b', 'b','b'],
                   'product': ['A', 'B', 'C', 'D', 'E', 'F', 'A', 'B', 'G', 'I', 'K', 'F']})

And I want to pivot the dataframe for the output to have one row for each recipe and multiple columns for each product, ex:

recipe  p1  p2  p3  p4  p5  p6
a       A   B   C   D   E   F
b       A   B   G   I   K   F

Also, there are multiple products, but I want the top 6 based on another column that gives a ranking for how relevant the product is

@KannanKandasamy I tried using pivot, but it doesn't give me what I am looking for ... I'm playing around using metl and pivot but not luck yet — Maria Nazari, Jan 30 '19 at 21:01
Maybe you will find something like this of help: ```pd.get_dummies(df, columns=["product"]).groupby("recipe").sum(min_count=1)``` — gorjan, Jan 30 '19 at 21:03
@gorjan your code is interesting, but I have thousands of products names, and only want the top 10. Your method will not work. I also want the columns to be labeled product 1 - 10 and with the actual product names in the cells. — Maria Nazari, Jan 30 '19 at 21:10
@gorjan so I will have 100s of products matched to each recipe, and I only want the top n of these based on their another column called cos_score. I don't want 100s of product columns, I just want the top n products for each recipe. — Maria Nazari, Jan 30 '19 at 21:28

score 4 · Accepted Answer · answered Jan 30 '19 at 21:38

This is actually a groupby and unnesting problem.

u = df.groupby('recipe')['product'].apply(list)
pd.DataFrame(u.tolist(), index=u.index).rename(lambda x: 'p{}'.format(x+1), axis=1)

       p1 p2 p3 p4 p5 p6
recipe                  
a       A  B  C  D  E  F
b       A  B  G  I  K  F

How to pivot pandas dataframe on specific rows

1 Answers1