1

I have a dataframe like this:

    NUMBER          NAME
1   000231          John Stockton
2   009456          Karl Malone
3   100000901       John Stockton
4   100008496       Karl Malone

I want to obtain a new dataframe with:

    NAME            VALUE1  VALUE2
1   John Stockton   000231  100000901
2   Karl Malone     009456  100008496

I think I should use pd.groupby(), but I have no function to pass as an aggregator (I don't need to compute any mean(), min(), or max() value). If I just use pd.groupby() without any aggregator, I get:

In[1]: pd.DataFrame(df.groupby(['NAME']))

Out[1]: 
           0               1
        0  John Stockton   NAME NUMBER 000231 100000901
        1  Karl Malone     NAME NUMBER 009456 100008496

What am I doing wrong? Do I need to pivot the dataframe?

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
Zizzipupp
  • 1,301
  • 1
  • 11
  • 27

1 Answers1

1

Actually, you need a bit mode complicated pipeline:

(df.assign(group=df.groupby('NAME').cumcount().add(1)
   .pivot(index='NAME', columns='group', values='NUMBER')
   .rename_axis(None, axis=1)
   .add_prefix('VALUE')
   .reset_index()
)

output:

            NAME  VALUE1     VALUE2
0  John Stockton     231  100000901
1    Karl Malone    9456  100008496
mozway
  • 194,879
  • 13
  • 39
  • 75