Possible Duplicate:
easiest way to parse xml in python

I need to parse a file which looks like an XML file but has no XML declaration.

Here is an example of the file:

<connection name="name_1">
  <parameter name="user" value="user_value_1"/>
  <parameter name="password" value="psw_1"/>
</connection>
<connection name="name_2">
  <parameter name="user" value="user_value_2"/>
  <parameter name="password" value="psw_2"/>
</connection>

<connection name="name_n">
  <parameter name="user" value="user_value_n"/>
  <parameter name="password" value="psw_n"/>
</connection>

My question is, which libraries can I use to parse the current file?

In practice, given the current file, how can I obtain the following output?

{"connection names":["name_1","name_2",…,"name_n"]}

Thanks,

Antonio

antonjs
  • What have you tried so far? An XML parser is your friend... what is the specific problem here? –  May 20 '11 at 15:33

1 Answer

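Your XML isn't well-formed and won't parse as-is, because it has no single root element wrapping the connection elements. (The missing XML declaration is not the problem; the declaration is optional.) Here's a valid version: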

<connections>
  <connection name="name_1">
    <parameter name="user" value="user_value_1"/>
    <parameter name="password" value="psw_1"/>
  </connection>

  <connection name="name_2">
    <parameter name="user" value="user_value_2"/>
    <parameter name="password" value="psw_2"/>
  </connection>

  <connection name="name_n">
    <parameter name="user" value="user_value_n"/>
    <parameter name="password" value="psw_n"/>
  </connection>
</connections>
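
If editing the file by hand is not an option, the root element can also be added in code before parsing. The following is only a sketch: `wrap_in_root` is a hypothetical helper, the root name `connections` is an arbitrary choice, and it assumes the file really has no XML declaration (a declaration in the middle of the wrapped text would make the parser fail).

```python
from xml.dom.minidom import parseString

# Rootless fragment, shortened from the file in the question.
fragment = '''<connection name="name_1">
  <parameter name="user" value="user_value_1"/>
</connection>
<connection name="name_2">
  <parameter name="user" value="user_value_2"/>
</connection>'''


def wrap_in_root(text, root="connections"):
    """Hypothetical helper: surround the fragment with one root element."""
    return "<{0}>{1}</{0}>".format(root, text)


document = parseString(wrap_in_root(fragment))
connection_names = [
    connection.getAttribute("name")
    for connection in document.getElementsByTagName("connection")
]
print(connection_names)
```

For a real file, `fragment` would be the result of reading the file's text instead of a literal string.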

You can use minidom (xml.dom.minidom, part of the standard library) to parse it. Yes, it's kind of slow for documents with lots of elements, but I can't help using something that feels just like the DOM in JavaScript:

from xml.dom.minidom import parseString

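# The XML declaration must be the very first thing in the string,
# so it follows the opening quotes with no newline in between.
document = parseString('''<?xml version="1.0"?>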
<connections>
  <connection name="name_1">
    <parameter name="user" value="user_value_1"/>
    <parameter name="password" value="psw_1"/>
  </connection>

  <connection name="name_2">
    <parameter name="user" value="user_value_2"/>
    <parameter name="password" value="psw_2"/>
  </connection>

  <connection name="name_n">
    <parameter name="user" value="user_value_n"/>
    <parameter name="password" value="psw_n"/>
  </connection>
</connections>''')

names = {'connection names': []}

# Collect the name attribute of every <connection> element.
for connection in document.getElementsByTagName('connection'):
    names['connection names'].append(connection.getAttribute('name'))

print(names)

And the output (under Python 3) is:

{'connection names': ['name_1', 'name_2', 'name_n']}
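
As for other libraries, the same extraction can be sketched with xml.etree.ElementTree, which is also in the standard library. This is an alternative for comparison, not the approach used above; the sample is shortened and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the rooted document from above.
xml_text = '''<connections>
  <connection name="name_1">
    <parameter name="user" value="user_value_1"/>
  </connection>
  <connection name="name_2">
    <parameter name="user" value="user_value_2"/>
  </connection>
  <connection name="name_n">
    <parameter name="user" value="user_value_n"/>
  </connection>
</connections>'''

root = ET.fromstring(xml_text)

# iter() walks the tree and yields every element with the given tag.
names = {"connection names": [c.get("name") for c in root.iter("connection")]}
print(names)
```

Like minidom, ElementTree needs a single root element, so the rootless file still has to be wrapped first.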
Blender