I want to process a very large XML file (> 3 GB) with Python 3, but the problem is that the XML file is incomplete, like this:
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<neighbor name="Austria" direction="E"/>
</country>
<country name="Singapore">
<rank>4</rank>
<year>2011</year>
<neighbor name="Malaysia" direction="N"/>
</country>
<country name="Panama">
<rank>68</rank>
The result that I'm looking for is this:
<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<neighbor name="Austria" direction="E"/>
</country>
<country name="Singapore">
<rank>4</rank>
<year>2011</year>
<neighbor name="Malaysia" direction="N"/>
</country>
</data>
So I have to add the header part (shown below) to the XML file:
<?xml version="1.0"?>
<data>
then delete the incomplete part (shown below) of the XML file:
<country name="Panama">
<rank>68</rank>
and finally add the closing part (shown below) to the end of the XML file:
</data>
All of these steps must be done by a Python script.
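To make the three steps concrete, here is a rough sketch of what I have in mind, not a tested solution. It assumes that every complete record ends with a `</country>` line of its own (as in the sample above) and that the file is UTF-8 encoded. The names `repair_xml`, `HEADER`, `FOOTER` and `RECORD_END` are made up for illustration. It reads the input line by line, so only one record is held in memory at a time, and it writes the result to a new file:

```python
HEADER = '<?xml version="1.0"?>\n<data>\n'
FOOTER = '</data>\n'
RECORD_END = '</country>'  # assumption: closing tag sits on its own line


def repair_xml(src_path, dst_path, record_end=RECORD_END):
    """Copy only the complete records from src_path into dst_path,
    wrapped in the header and footer. Returns the number of records kept."""
    count = 0
    pending = []  # lines of the record currently being read
    with open(src_path, encoding='utf-8') as src, \
            open(dst_path, 'w', encoding='utf-8') as dst:
        dst.write(HEADER)
        for line in src:  # streams the file; never loads all 3 GB at once
            pending.append(line)
            if line.strip() == record_end:
                # The record is complete, so it is safe to write it out.
                dst.writelines(pending)
                pending.clear()
                count += 1
        # Whatever is still in `pending` is the truncated record; drop it.
        dst.write(FOOTER)
    return count
```

For example, `repair_xml("broken.xml", "fixed.xml")` (both file names are placeholders) would write the repaired copy and return how many complete records were kept. Writing to a separate file leaves the original untouched, at the cost of needing roughly the same amount of free disk space again.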
Thanks for your help.