I am reading a CSV file into a pandas DataFrame, and the resulting data looks like this:

    item        seq_no      db_xml
0   28799179    5       ['<my_xml>....</my_xml>']
1   28839888    1       ['<my_xml>....</my_xml>']
2   28840113    75      ['<my_xml>....</my_xml>']
3   28852466    20,22   ['<my_xml1>....</my_xml1>', '<my_xml2>....</my_xml2>']

I need to convert the DataFrame above into the form below: when an item has several comma-separated seq_no values, each seq_no and its corresponding db_xml entry should go in a separate row.

    item        seq_no      db_xml
0   28799179    5       ['<my_xml>....</my_xml>']
1   28839888    1       ['<my_xml>....</my_xml>']
2   28840113    75      ['<my_xml>....</my_xml>']
3   28852466    20      ['<my_xml1>....</my_xml1>']
4   28852466    22      ['<my_xml2>....</my_xml2>']

How can I achieve this in pandas so that seq_no is also split into separate rows?

Shankar Guru
    Do you need `df = df.explode('db_xml')` ? – jezrael Oct 07 '20 at 11:51
    @jezrael: Thanks, it works for the column db_xml, but seq_no is set to 20,22 in both rows. How can I set it to 20 for the first row and 22 for the second row? – Shankar Guru Oct 07 '20 at 12:45
    I see. You can create a list in the `seq_no` column with `df.seq_no = df.seq_no.str.split(',')` and then use the solution from the second linked duplicate answer. – jezrael Oct 07 '20 at 12:50
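
The approach suggested in the comments can be sketched roughly as follows. This is a minimal example, not a definitive solution: it assumes `seq_no` holds comma-separated strings, that `db_xml` already holds Python lists, and that each row has the same number of `seq_no` values as `db_xml` entries. The sample frame is rebuilt by hand from the question, and exploding several columns in one call assumes pandas 1.3 or later.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (assumption: db_xml
# is already a list per row, and seq_no is a comma-separated string).
df = pd.DataFrame({
    "item": [28799179, 28839888, 28840113, 28852466],
    "seq_no": ["5", "1", "75", "20,22"],
    "db_xml": [
        ["<my_xml>....</my_xml>"],
        ["<my_xml>....</my_xml>"],
        ["<my_xml>....</my_xml>"],
        ["<my_xml1>....</my_xml1>", "<my_xml2>....</my_xml2>"],
    ],
})

# Turn "20,22" into ["20", "22"] so seq_no is list-like, like db_xml.
df["seq_no"] = df["seq_no"].str.split(",")

# Explode both list columns together; the lists in each row must have
# matching lengths. Requires pandas >= 1.3 for multi-column explode.
out = df.explode(["seq_no", "db_xml"], ignore_index=True)

print(out)
```

After this, `seq_no` holds strings and `db_xml` holds one XML string per row rather than a one-element list. If `db_xml` comes out of `read_csv` as a plain string that merely looks like a list, it would likely need to be parsed into a real list first (for example with `ast.literal_eval`) before exploding.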
