How to transpose pandas data frame by a column and value?

Question

I have a dataframe like this

Input

student_id  rep
abc100      1   
abc101      2
abc102      1
abc103      2
abc104      1
abc105      2
abc106      1
abc107      2

Expected output

1       2
abc100  abc101
abc102  abc103
abc104  abc105
abc106  abc107

I tried

df = df.pivot(columns='rep', values='student_id')

but the result contains a lot of NaNs and is not the expected output.

I searched Stack Overflow but couldn't find an answer.

As general advice, please provide sample data as text, not images. – Yuca Dec 17 '18 at 16:12
@coldspeed that doesn't yield the desired output, but maybe that's because of my index assumptions – Yuca Dec 17 '18 at 16:16
@coldspeed your solution doesn't work and this is not a duplicate. I would suggest you reopen this question. – GeorgeOfTheRF Dec 17 '18 at 16:19
@Yuca Perhaps the index should be the result of groupby and cumcount... hmm, yeah that might work. – cs95 Dec 17 '18 at 16:19
@GeorgeOfTheRF Gladly... just need to wait for OP to replace their images with text ;-) – cs95 Dec 17 '18 at 16:19
@coldspeed that's effectively what I suggested. It's scary how some solutions match exactly how others think, makes me feel good hehe – Yuca Dec 17 '18 at 16:21

score 4 · Accepted Answer · answered Dec 17 '18 at 16:19

To match the exact desired output you could do

df['aux'] = df.groupby('rep').cumcount()
df.pivot(index='aux', columns='rep', values='student_id')

Output:

rep       1       2
aux                
0    abc100  abc101
1    abc102  abc103
2    abc104  abc105
3    abc106  abc107
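
For anyone who wants to run this end to end, here is a minimal sketch. It rebuilds the sample frame from the question by hand, and the names `naive` and `wide` plus the final clearing of the axis names are additions for illustration, not part of the answer above. It also shows why the attempt in the question is full of NaNs: without `index=`, `pivot` keeps the original row labels 0..7, so each row has a value in only one of the two columns.

```python
import pandas as pd

# Sample data typed in from the question.
df = pd.DataFrame({
    "student_id": ["abc100", "abc101", "abc102", "abc103",
                   "abc104", "abc105", "abc106", "abc107"],
    "rep": [1, 2, 1, 2, 1, 2, 1, 2],
})

# The attempt from the question: every original row label survives,
# so each row is filled in one column and NaN in the other.
naive = df.pivot(columns="rep", values="student_id")
print(naive.shape)  # (8, 2)

# cumcount numbers the rows inside each rep group (0, 1, 2, 3), which
# gives both groups a shared row label to line up on.
wide = df.assign(aux=df.groupby("rep").cumcount()).pivot(
    index="aux", columns="rep", values="student_id"
)

# Optional cosmetic step (an addition): drop the 'aux' and 'rep' labels
# so the printed frame looks like the expected output.
wide.index.name = None
wide.columns.name = None
print(wide)
```

Using `assign` here leaves the original `df` without an extra `aux` column, which is the variant suggested in the comments.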

Yuca


Almost... I would've done `df.assign(index=df.groupby('rep').cumcount()).pivot('index', 'rep', 'student_id')` to avoid modifying the original, but this is effectively more efficient. +1 – cs95 Dec 17 '18 at 16:22

when people seem relatively new I prefer to provide the slow but readable solution. However, one liners are just too pretty and elegant – Yuca Dec 17 '18 at 16:23

Karn Kumar · Answer 2 · 2018-12-17T16:38:10.250

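You can build the new frame by slicing the column with iloc and a step argument (this relies on the rows strictly alternating between rep 1 and rep 2, as they do in the sample):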

>>> pd.DataFrame({'student_id':df['student_id'].iloc[::2].values, 'student_id_1':df['student_id'].iloc[1::2].values})
  student_id student_id_1
0     abc100       abc101
1     abc102       abc103
2     abc104       abc105
3     abc106       abc107
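
A hedged sketch of the same slicing idea with a guard around it: the check on `rep` and the choice to name the columns `1` and `2` are assumptions added here, not part of the snippet above. Slicing ignores the `rep` values entirely, so it is worth confirming the alternating pattern before trusting the result.

```python
import pandas as pd

# Sample data typed in from the question.
df = pd.DataFrame({
    "student_id": ["abc100", "abc101", "abc102", "abc103",
                   "abc104", "abc105", "abc106", "abc107"],
    "rep": [1, 2, 1, 2, 1, 2, 1, 2],
})

# Guard (an addition): even positions must all be rep 1, odd positions rep 2.
alternates = bool(
    (df["rep"].iloc[::2] == 1).all() and (df["rep"].iloc[1::2] == 2).all()
)
if not alternates:
    raise ValueError("rows do not alternate between rep 1 and rep 2")

ids = df["student_id"]
# Take every other row starting at position 0 and at position 1.
sliced = pd.DataFrame({
    1: ids.iloc[::2].to_numpy(),
    2: ids.iloc[1::2].to_numpy(),
})
print(sliced)
```

If the rows are not in alternating order, the groupby/cumcount approach is the safer choice, since it pairs rows by their position within each rep group rather than by their position in the frame.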

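Or, as @coldspeed suggested in the comments (repeated here for visibility, with keyword arguments because pandas 2.0 and later no longer accept positional arguments to pivot):

df.assign(index=df.groupby('rep').cumcount()).pivot(index='index', columns='rep', values='student_id')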
