Turn similar rows into columns in Pandas

Question

I have the following DataFrame:

df = pd.DataFrame({'doc':['john','john','john', 'mary', 'mary', 'mary'], 'token':[1,2,3,4,5,6,]})

How do I turn it into:

df = pd.DataFrame({'john':[1,2,3],'mary':[4,5,6]})

I've tried pivot, pivot_table, stack, and unstack but had no success.

Corralien · Answer 1 · 2023-01-06T20:21:26.913

3

Use groupby(...).cumcount() to build a helper index that numbers the rows within each doc, then pivot on that index to get the expected DataFrame:

>>> (df.assign(index=df.groupby('doc').cumcount())
       .pivot(index='index', columns='doc', values='token')
       .rename_axis(index=None, columns=None))

   john  mary
0     1     4
1     2     5
2     3     6
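To see why this works, it can help to look at the intermediate counter on its own. The following is a minimal, self-contained sketch of the same idea; the variable names `position` and `wide` are mine and only there for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "doc": ["john", "john", "john", "mary", "mary", "mary"],
    "token": [1, 2, 3, 4, 5, 6],
})

# Number each row within its 'doc' group: 0, 1, 2 for john, then 0, 1, 2 for mary.
# ('position' is an illustrative name, not something pandas requires.)
position = df.groupby("doc").cumcount()

# With the counter as the row index, every (position, doc) pair is unique,
# which is what pivot needs in order to reshape without raising.
wide = (
    df.assign(position=position)
      .pivot(index="position", columns="doc", values="token")
      .rename_axis(index=None, columns=None)
)
print(wide)
```

Without the counter there is no column that says which row of the result each token belongs to, which is probably why pivot and unstack on the original two columns did not work.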

Update: as suggested by @Chrysophylaxs, pivot_table can take the computed counter directly as its index:

>>> (df.pivot_table(columns="doc", index=df.groupby("doc").cumcount(), values="token")
       .rename_axis(columns=None))

   john  mary
0     1     4
1     2     5
2     3     6
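One thing worth keeping in mind with this variant: pivot_table aggregates, and its default aggfunc is the mean. Here each (counter, doc) pair occurs exactly once, so nothing is really averaged. A minimal sketch, assuming you would rather make that explicit, passes aggfunc="first"; that argument is my addition and not part of the suggestion above.

```python
import pandas as pd

df = pd.DataFrame({
    "doc": ["john", "john", "john", "mary", "mary", "mary"],
    "token": [1, 2, 3, 4, 5, 6],
})

# Same call as above, but with an explicit aggregation.
# aggfunc="first" keeps the single value in each (counter, doc) cell
# instead of relying on the default mean.
wide = (
    df.pivot_table(
        columns="doc",
        index=df.groupby("doc").cumcount(),
        values="token",
        aggfunc="first",
    )
    .rename_axis(columns=None)
)
print(wide)
```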

edited Jan 06 '23 at 20:21

answered Jan 06 '23 at 19:55

Corralien


thanks. still had to do some more things: pd.DataFrame(list(df.groupby('doc').agg(list).to_dict().values())[0]) – brazilian_student Jan 06 '23 at 20:10
What about `df.pivot_table(columns="doc", index=df.groupby("doc").cumcount(), values="token")`? – Chrysophylaxs Jan 06 '23 at 20:17

@Chrysophylaxs. You are right. I always forget it's possible to add a dynamic index/column to `pivot_table`. – Corralien Jan 06 '23 at 20:19

@Chrysophylaxs. I updated my answer according to your comment and gave you the credit – Corralien Jan 06 '23 at 20:22
Appreciate it @Corralien, was playing around with stack and unstack but could not get it to work until I saw your initial suggestion :) – Chrysophylaxs Jan 06 '23 at 20:23

score 0 · Answer 2 · answered Jan 06 '23 at 19:59


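You could build the DataFrame from a dict of per-doc Series, resetting each group's index so that the columns line up: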

df_cols = pd.DataFrame({k: v.reset_index(drop=True) for k, v in df.groupby('doc')['token']})

Output:

   john  mary
0     1     4
1     2     5
2     3     6
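The reset_index(drop=True) call is the important part. As a rough, self-contained sketch (the name `misaligned` is only for illustration), compare the result with and without it:

```python
import pandas as pd

df = pd.DataFrame({
    "doc": ["john", "john", "john", "mary", "mary", "mary"],
    "token": [1, 2, 3, 4, 5, 6],
})

groups = df.groupby("doc")["token"]

# Without resetting, each Series keeps its original row labels
# (0-2 for john, 3-5 for mary). The DataFrame constructor aligns on
# those labels, so the result has 6 rows padded with NaN.
misaligned = pd.DataFrame({k: v for k, v in groups})

# Resetting gives every group the labels 0, 1, 2, so the columns line up.
df_cols = pd.DataFrame({k: v.reset_index(drop=True) for k, v in groups})

print(misaligned.shape)  # (6, 2)
print(df_cols.shape)     # (3, 2)
```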


Nick ODell

