I have a problem. My CSV file has a column that contains XML, like this:
ID Name Request
4223 Axery <Type xmlns="http://data"
xmlns:i="http://www.rij3.instance"><Person><City>
<Nr>5050</Nr><Description>Big</Description>
<Date>2012-10-30T00:00:00Z</Date></City><Details><Name>London</Name>
<Account>5050</Account><Date>2019-07-07T00:00:00Z</Date>
<Status>Open</Status></Details><..............[more info]>
....
</Person></Type>
4001 Jix <Type xmlns="http://data"
xmlns:i="http://www.rij3.instance"><Person><City>
<Nr>5024</Nr><Description>Big</Description>
<Date>2012-10-30T00:00:00Z</Date></City><Details><Name>London</Name>
<Account>5024</Account><Date>2019-07-07T00:00:00Z</Date>
<Status>Open</Status></Details><..............[more info]>
....
</Person></Type>
....
4067 AOe <Type xmlns="http://data"
xmlns:i="http://www.rij3.instance"><Person><City>
<Nr>5011</Nr><Description>Big</Description>
<Date>2012-10-30T00:00:00Z</Date></City><Details><Name>London</Name>
<Account>5011</Account><Date>2019-07-07T00:00:00Z</Date>
<Status>Open</Status></Details><..............[more info]>
....
</Person></Type>
I want to extract the XML info. I use pandas to read my CSV file:
import pandas as pd
df = pd.read_csv('my_file.csv', header=0, sep='|', on_bad_lines='skip')  # on pandas < 1.3 use error_bad_lines=False
I want a final df like this:
**ID Name Type Person City Nr Description Date ........**
4223 Axery 5050 Big 2012-10-30T00:00:00Z
Any suggestions? My idea was to parse only the XML column and then 'concat' the result with the other columns.
One Request value, formatted for readability:
<Type xmlns="http://data"
      xmlns:i="http://www.rij3.instance">
  <Person>
    <City>
      <Nr>5050</Nr>
      <Description>Big</Description>
      <Date>2012-10-30T00:00:00Z</Date>
    </City>
    <Details>
      <Name>London</Name>
      <Account>5050</Account>
      <Date>2019-07-07T00:00:00Z</Date>
      <Status>Open</Status>
    </Details>
  </Person>
</Type>
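
The idea of parsing only the XML column and concatenating the result could be sketched roughly as below. This is only a sketch under assumptions: every Request value is taken to be complete, well-formed XML (the elided "[more info]" parts are not handled), and the helper names `local_name` and `flatten_request` and the one-row `sample` frame are made up for illustration. Because `Date` occurs under both `City` and `Details`, and `Name` would clash with the CSV column, each leaf is stored under a `Parent_Leaf` key.

```python
import xml.etree.ElementTree as ET

import pandas as pd


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit('}', 1)[-1]


def flatten_request(xml_text):
    """Hypothetical helper: map each leaf element to a 'Parent_Leaf' key."""
    row = {}

    def walk(elem, path):
        for child in elem:
            child_path = path + [local_name(child.tag)]
            if len(child):  # element has children, keep descending
                walk(child, child_path)
            else:  # leaf element, keep its text
                row['_'.join(child_path[-2:])] = (child.text or '').strip()

    walk(ET.fromstring(xml_text), [])
    return row


# One-row stand-in for the frame returned by pd.read_csv (assumed layout).
request = (
    '<Type xmlns="http://data" xmlns:i="http://www.rij3.instance">'
    '<Person><City><Nr>5050</Nr><Description>Big</Description>'
    '<Date>2012-10-30T00:00:00Z</Date></City>'
    '<Details><Name>London</Name><Account>5050</Account>'
    '<Date>2019-07-07T00:00:00Z</Date><Status>Open</Status></Details>'
    '</Person></Type>'
)
sample = pd.DataFrame({'ID': [4223], 'Name': ['Axery'], 'Request': [request]})

# Parse only the XML column, then concat with the remaining columns.
parsed = pd.DataFrame(
    [flatten_request(x) for x in sample['Request']], index=sample.index
)
result = pd.concat([sample.drop(columns='Request'), parsed], axis=1)
print(result.columns.tolist())
```

With the sample row, the resulting columns are ID, Name, City_Nr, City_Description, City_Date, Details_Name, Details_Account, Details_Date and Details_Status. All extracted values stay strings, so dates and numbers would still need converting (for example with `pd.to_datetime`).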