I have a dataframe as follows:

imagename   date                 seqid             locid
image1.jpg  16-05-2019 19:08:16  [7, 23, 29]        vp1
image2.jpg  16-05-2019 19:08:17  [15, 23, 48,3798]  vp1

The column seqid contains arrays of different lengths. I want to split each array so that every item gets its own row, keeping that one value in seqid and repeating the values of all other columns. The desired output is as follows:

imagename   date                 seqid  locid
image1.jpg  16-05-2019 19:08:16  7      vp1
image1.jpg  16-05-2019 19:08:16  23     vp1
image1.jpg  16-05-2019 19:08:16  29     vp1
image2.jpg  16-05-2019 19:08:17  15     vp1
image2.jpg  16-05-2019 19:08:17  23     vp1
image2.jpg  16-05-2019 19:08:17  48     vp1
image2.jpg  16-05-2019 19:08:17  3798   vp1

The input file is in CSV format. I understand that, after reading the CSV into a pandas DataFrame, I could split the array into multiple columns using

pd.DataFrame(df.seqid.tolist(), columns=['col1', 'col2'])

However, I am not sure how to proceed when I don't know the length of the array in each row.

I just couldn't figure out how to do this.
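For reference, the transformation described above can be sketched as follows. This is a minimal sketch under a few assumptions rather than a definitive solution: it assumes pandas 0.25 or later (where DataFrame.explode is available), that the list column is quoted in the CSV, and that the file looks like the hypothetical csv_text string below, which stands in for the real input file.

```python
import ast
import io

import pandas as pd

# Hypothetical CSV text standing in for the real input file; the list
# column is quoted because it contains commas.
csv_text = '''imagename,date,seqid,locid
image1.jpg,16-05-2019 19:08:16,"[7, 23, 29]",vp1
image2.jpg,16-05-2019 19:08:17,"[15, 23, 48,3798]",vp1
'''

# Lists stored in a CSV are read back as plain strings, so parse each
# one into a real Python list while reading.
df = pd.read_csv(io.StringIO(csv_text), converters={"seqid": ast.literal_eval})

# explode() emits one row per list element and repeats the other columns,
# regardless of how long each list is.
result = df.explode("seqid").reset_index(drop=True)

# The exploded column has object dtype; convert it back to integers.
result["seqid"] = result["seqid"].astype(int)

print(result)
```

With a real file, the io.StringIO(csv_text) argument would be replaced by the file path. Because explode works row by row, the length of each array does not need to be known in advance.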

petezurich
Apricot
  • If I understand this correctly, I think this [post](https://medium.com/@sureshssarda/pandas-splitting-exploding-a-column-into-multiple-rows-b1b1d59ea12e) by Suresh Sarda can help – Buckeye14Guy Jun 15 '19 at 14:04
  • @Buckeye14Guy Thanks a ton...the post really helped me...I can accept this as answer – Apricot Jun 15 '19 at 14:11
  • I agree this question may be a duplicate. However, after searching for more than a couple of hours, I still could not find this question. Besides, I believe the post by Suresh Sarda is much more intuitive and easy to understand for novices like me. – Apricot Jun 15 '19 at 14:14
  • @Apricot No problem, you now have more options to choose from. :) Cheers..!! – anky Jun 15 '19 at 14:26
