
I am trying to load this vasprun.xml file into a pandas DataFrame like this:

import pandas as pd
df = pd.read_xml("vasprun.xml")

But I keep getting the error below. I have about 900 files like this and I really need to load the data into pandas. This is the full traceback:

Traceback (most recent call last):
  File "/home/aneeq/Documents/Internship/Python-learner/vasp_read.py", line 4, in <module>
    df  = pd.read_xml("vasprun.xml", attrs_only=True)
  File "/home/aneeq/.local/lib/python3.10/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/aneeq/.local/lib/python3.10/site-packages/pandas/io/xml.py", line 938, in read_xml
    return _parse(
  File "/home/aneeq/.local/lib/python3.10/site-packages/pandas/io/xml.py", line 733, in _parse
    data_dicts = p.parse_data()
  File "/home/aneeq/.local/lib/python3.10/site-packages/pandas/io/xml.py", line 389, in parse_data
    self.xml_doc = XML(self._parse_doc(self.path_or_buffer))
  File "/home/aneeq/.local/lib/python3.10/site-packages/pandas/io/xml.py", line 554, in _parse_doc
    doc = fromstring(
  File "src/lxml/etree.pyx", line 3254, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1800, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1141, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 88
lxml.etree.XMLSyntaxError: Extra content at the end of the document, line 88, column 2
azad11
  • That file is invalid XML. – LMC Aug 09 '22 at 21:40
  • The file is not [*well-formed*, which is technically different than *invalid*](https://stackoverflow.com/q/134494/290085). See duplicate link that lists the most common mistakes that would cause an "extra content at the end of the document" error. In this case, there is additional markup after the root `atominfo` element, but there can only be a ***single root element*** in well-formed XML. – kjhughes Aug 09 '22 at 22:53
  • Yes, I have read the other threads about this error. But this is the form the data came in, and I have a large number of files like it, so I can't edit them all by hand one by one. What should I do? – azad11 Aug 11 '22 at 18:11
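
One way to act on the single-root comment without hand-editing 900 files might be to wrap each file's content in a synthetic root element before parsing. The sketch below is only an illustration under assumptions: it assumes the sole defect is several top-level elements (a file that was cut off mid-element would still fail to parse), the names `wrap_fragments`, `parse_fragment_file`, and `wrapper` are made up for this example, and the sample XML is a stand-in rather than real vasprun.xml content.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Matches a leading XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
_XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>")


def wrap_fragments(text: str, root_tag: str = "wrapper") -> str:
    """Return `text` wrapped in one synthetic root element.

    `root_tag` is a hypothetical name chosen for this sketch; neither VASP
    nor pandas defines it.
    """
    # The declaration may only appear at the very start of a document,
    # so it has to be removed before the new root is put in front of it.
    body = _XML_DECL.sub("", text, count=1)
    return f"<{root_tag}>{body}</{root_tag}>"


def parse_fragment_file(path: str, encoding: str = "utf-8") -> ET.Element:
    """Parse a file that holds several top-level elements into one tree.

    The default encoding is an assumption; use whatever the files declare.
    """
    text = Path(path).read_text(encoding=encoding)
    return ET.fromstring(wrap_fragments(text))


# Stand-in for a problem file: two sibling elements at the top level.
# Element and attribute names below are illustrative only.
sample = (
    '<?xml version="1.0"?>\n'
    "<atominfo><atoms>2</atoms></atominfo>\n"
    '<structure name="initial"/>'
)

root = ET.fromstring(wrap_fragments(sample))
top_level_tags = [child.tag for child in root]
print(top_level_tags)  # ['atominfo', 'structure']
```

If this works on one file, the same function could be applied in a loop over all of them (for example with `Path(...).glob("*.xml")`), and the wrapped string could then be handed to `pd.read_xml` through an `io.StringIO` buffer together with an `xpath` that selects the elements of interest. Whether the resulting table is useful depends on how the data is nested in the real files.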

0 Answers