5

I have a DataFrame in the format below:

student   marks
a         [12,12,34]
b         [34,35]
c         [23,45,23]

I want to convert it to the following:

student marks_1  marks_2 marks_3
a       12        12      34
b       34        35      NaN
c       23        45      23

How can I achieve this? Any help is appreciated.

Bravo

2 Answers

5

Use `pop` to extract the `marks` column, convert it to a list of lists, build a new DataFrame from that, rename its columns with a custom lambda function, and `join` the result back to the original:

f = lambda x: 'marks_{}'.format(x + 1)
df = df.join(pd.DataFrame(df.pop('marks').values.tolist()).rename(columns=f))
print (df)
  student  marks_1  marks_2  marks_3
0       a       12       12     34.0
1       b       34       35      NaN
2       c       23       45     23.0

Detail (run on the original DataFrame, before `marks` has been popped):

print (pd.DataFrame(df.pop('marks').values.tolist()))
    0   1     2
0  12  12  34.0
1  34  35   NaN
2  23  45  23.0
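
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The sample data is copied from the question; the intermediate names `marks`, `wide` and `result`, and the `index=marks.index` argument, are additions for illustration and not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the question (lists of unequal length).
df = pd.DataFrame({
    "student": ["a", "b", "c"],
    "marks": [[12, 12, 34], [34, 35], [23, 45, 23]],
})

# pop() removes the column from df and returns it as a Series.
marks = df.pop("marks")

# One row per list; shorter lists are padded with NaN.
# index=marks.index is an addition here so that the join below also
# aligns when df does not have the default 0..n-1 index.
wide = pd.DataFrame(marks.tolist(), index=marks.index)

# Rename the integer column labels 0, 1, 2 to marks_1, marks_2, marks_3.
wide = wide.rename(columns=lambda i: f"marks_{i + 1}")

# join() aligns the two frames on their index.
result = df.join(wide)
print(result)
```

Note that `marks_3` comes out as a float column because it has to hold a `NaN` for student `b`.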
jezrael
    As far as I can tell, the important step here is `new_df = pd.DataFrame(df['marks'].tolist())`. The `pop` call just deletes the old column after you use it, and I think you can skip the `.values` step. Thanks for this answer! – user7868 Jul 08 '22 at 01:29
1

Try

dfr = pd.concat([df.student, df.marks.apply(lambda el: pd.Series(
    el, index=['marks_{}'.format(i + 1) for i in range(len(el))]))], axis=1)

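The code above converts each list in `marks` into a Series whose index labels are `marks_1`, `marks_2`, and so on, so `apply` returns a DataFrame with those labels as columns. `concat` then places that DataFrame next to the `student` column.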

The output:

dfr

  student  marks_1  marks_2  marks_3
0       a     12.0     12.0     34.0
1       b     34.0     35.0      NaN
2       c     23.0     45.0     23.0
sgDysregulation
  • Hmmm, I think converting to `Series` is really slow - check [this](https://stackoverflow.com/a/35491399/2901002) – jezrael Feb 12 '18 at 11:39
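
To follow up on the speed comment, the sketch below checks that the two approaches from this thread produce the same values and then times each with `timeit`. The function names `via_series` and `via_tolist` are hypothetical, the sample data is copied from the question, and no timing figures are claimed here since they depend on the machine and on the size of the data.

```python
import timeit

import pandas as pd

df = pd.DataFrame({
    "student": ["a", "b", "c"],
    "marks": [[12, 12, 34], [34, 35], [23, 45, 23]],
})


def via_series(frame):
    # Per-row conversion: each list becomes its own labelled Series.
    expanded = frame["marks"].apply(
        lambda el: pd.Series(
            el, index=[f"marks_{i + 1}" for i in range(len(el))]
        )
    )
    return pd.concat([frame["student"], expanded], axis=1)


def via_tolist(frame):
    # Single conversion: build one DataFrame from the list of lists.
    expanded = pd.DataFrame(frame["marks"].tolist(), index=frame.index)
    expanded = expanded.rename(columns=lambda i: f"marks_{i + 1}")
    return frame[["student"]].join(expanded)


a = via_series(df)
b = via_tolist(df)

# The values agree; only the dtypes differ (float vs. int in some columns).
pd.testing.assert_frame_equal(a, b, check_dtype=False)

# Rough timing on this tiny frame; rerun on realistic data before concluding.
print("apply + Series:", timeit.timeit(lambda: via_series(df), number=50))
print("tolist:        ", timeit.timeit(lambda: via_tolist(df), number=50))
```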