Python XML File Open

Question

I am trying to open an XML file and parse it, but when I try to open it, the file never seems to open at all; the script just keeps running. Any ideas?

from xml.dom import minidom
Test_file = open('C::/test_file.xml','r')
xmldoc = minidom.parse(Test_file)

Test_file.close()

for i in xmldoc:
     print('test')

The file is 180.288 KB. Why does it never make it to the print portion?

Remove the XML stuff and check the file path by doing something like ``print Test_file`` or ``print Test_file.readline()``. — Hew Wolff, Sep 16 '13 at 18:15
Is the size of your XML document one hundred and eighty kilobytes or one hundred and eighty thousand kilobytes? (I'm spelling it out due to thousands separators being different in different cultures.) 180 kilobytes ought to be small enough for minidom to handle, but if you genuinely have 180 megabytes, avoid minidom, or any other XML DOM implementation for that matter, as DOM doesn't work at all well for documents of that size. Consider instead SAX or StAX. — Luke Woodward, Sep 16 '13 at 18:29
The `for` loop won't work. An `xml.dom.minidom.Document` instance is not an iterable sequence. — mzjn, Sep 16 '13 at 19:02

kjhughes · Accepted Answer · 2013-09-18T18:58:46.273

Running your Python code with a few adjustments:

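from xml.dom import minidom

# Open the file (note the single ':' in the path), parse it, then close it.
Test_file = open('C:/test_file.xml','r')
xmldoc = minidom.parse(Test_file)

Test_file.close()

def printNode(node):
    # Print this node, then recurse into each of its children.
    print(node)
    for child in node.childNodes:
        printNode(child)

printNode(xmldoc.documentElement)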

With this sample input as test_file.xml:

<a>
  <b>testing 1</b>
  <c>testing 2</c>
</a>

Yields this output:

<DOM Element: a at 0xbc56e8>
<DOM Text node "u'\n  '">
<DOM Element: b at 0xbc5788>
<DOM Text node "u'testing 1'">
<DOM Text node "u'\n  '">
<DOM Element: c at 0xbc5828>
<DOM Text node "u'testing 2'">
<DOM Text node "u'\n'">
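The whitespace-only text nodes in that output come from the indentation and line breaks in the sample XML. If only the elements matter, one option is to check each node's type while walking the tree. The following is a minimal sketch, assuming Python 3 and parsing the sample from a string rather than from C:/test_file.xml; the helper name element_names is hypothetical and not part of minidom.

```python
from xml.dom import minidom

# Same sample document as above, inlined so the sketch is self-contained.
SAMPLE = "<a>\n  <b>testing 1</b>\n  <c>testing 2</c>\n</a>"

def element_names(node):
    """Return the tag names of node and its descendants, skipping text nodes.

    Hypothetical helper for illustration; not part of xml.dom.minidom.
    """
    names = []
    if node.nodeType == node.ELEMENT_NODE:
        names.append(node.tagName)
    for child in node.childNodes:
        names.extend(element_names(child))
    return names

doc = minidom.parseString(SAMPLE)
names = element_names(doc.documentElement)
print(names)
```

For the sample input this collects the three element names a, b, and c and ignores the surrounding whitespace text nodes.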

Notes:

As @LukeWoodward mentioned, avoid DOM-based libraries for large inputs. 180K should be fine, but for 180M, minidom.parse() may run out of memory (MemoryError) before it ever returns.
As @alecxe mentioned, you should eliminate the extraneous ':' in the file path. You should have seen error output along the lines of IOError: [Errno 22] invalid mode ('r') or filename: 'C::/test_file.xml'.
As @mzjn mentioned, xml.dom.minidom.Document is not iterable. You should have seen error output along the lines of TypeError: iteration over non-sequence.
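For the large-input case in the first note, a streaming parser avoids building the whole tree in memory. The following is a rough sketch using the standard library's xml.sax, assuming Python 3; the ElementCounter class and count_elements function are hypothetical names made up for illustration, and the sample is parsed from bytes rather than from a real file.

```python
import xml.sax

class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: tallies element names as they stream past."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag; no tree is kept in memory.
        self.counts[name] = self.counts.get(name, 0) + 1

def count_elements(xml_bytes):
    """Parse xml_bytes with SAX and return a dict of element name -> count."""
    handler = ElementCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.counts

sample = b"<a>\n  <b>testing 1</b>\n  <c>testing 2</c>\n</a>"
counts = count_elements(sample)
print(counts)
```

To process a file on disk instead, xml.sax.parse() accepts a filename or an open file object together with the same kind of handler.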
