Separating data in a column by grouping

Question

I have a multiple column dataframe and want to separate data in a particular column by grouping them based on another column.

Here is an example:

ID     Name   Score
1      John   100
2      Lisa   80
3      David  75 
4      Lisa   92
5      John   89
6      Lisa   72

I would like my output to be like:

index  John  Lisa  David
0      100   80    75
1      89    92    NaN
2      NaN   72    NaN

I understand I can easily use the code:

df[df['Name'] == 'John]] and separate all the scores that John has in the dataframe, but since my dataframe is very large I would like to have a clean output like what I showed here.

Looks like a simple pivot (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot.html) — BallpointBen, Jan 08 '19 at 19:18

yatu · Answer 1 · 2019-01-08T19:25:29.727

1

You can use pivot_table. You can generate the index needed to pivot grouping the dataframe by Name and taking the cumcount:

ix = df.groupby('Name').cumcount()
pd.pivot_table(df,'Score', ix, 'Name')

Name  David   John  Lisa
1      75.0  100.0  80.0
2       NaN   89.0  92.0
3       NaN    NaN  72.0

edited Jan 08 '19 at 19:25

answered Jan 08 '19 at 19:19

yatu

86,083
12
84
139

Surprisingly for smaller DataFrames (a few thousand rows) `df.groupby('Name').Score.apply(list).apply(pd.Series).T` is faster. – ALollz Jan 08 '19 at 19:27
Wouldn't have said that piece of code works faster. Interesting to know :-) – yatu Jan 08 '19 at 19:29

Separating data in a column by grouping

1 Answers1