I am working with a Dataframe that has several columns that are XML strings. The XML string has the following structure.
<row>
<group>
<date>2021-12-25</date>
<ind1>50</ind1>
...
<indN>10</indN>
</group>
...
<group>
<date>2021-12-31</date>
<ind1>28</ind1>
...
<indN>13</indN>
</group>
</row>
The dataframe has the following structure:
id name indicators
1 John '<?xml version="1.0"?>\n<row>\n <group>\n <date>2021-12-25</date> ...'
1 John '<?xml version="1.0"?>\n<row>\n <group>\n <date>2021-12-31</date> ...'
...
I want to explode the XML string so that it creates a column for each indicator. I want something that looks like the Dataframe below.
id name date ind1 ... indN
1 John 2021-12-25 50 ... 10
1 John 2021-12-31 28 ... 13
...
This is what the dataframe looks like
df.head(2).to_dict()
>> {'KEY': {0: 1,
1: 2},
'INDICATORS': {0: '<?xml version="1.0"?>\n<row>\n <group>\n <date>2017-12-31</date>\n <ind1>14</ind1>\n <ind2>24</ind2>\n </group>\n <group>\n <date>2015-12-31</date>\n <ind1>10</ind1>\n <ind2>20</ind2>\n </group>\n <group>\n <date>2016-12-31</date>\n <ind1>12</ind1>\n <ind2>22</ind2>\n </group>\n</row>\n',
1: '<?xml version="1.0"?>\n<row>\n <group>\n <date>2017-12-31</date>\n <ind1>0</ind1>\n <ind2>0</ind2>\n </group>\n <group>\n <date>2015-12-31</date>\n <ind1>0</ind1>\n <ind2>0</ind2>\n </group>\n <group>\n <date>2016-12-31</date>\n <ind1>0</ind1>\n <ind2>0</ind2>\n </group>\n</row>\n'}}
Do you have any idea?