I have a problem with a Python script that parses an XML file. This is the XML file:

file.xml

<Tag1 SchemaVersion="1.1" xmlns="http://www.microsoft.com/axe">
    <RandomTag>TextText</RandomTag>
    <Tag2 xmlns="http://schemas.datacontract.org/2004/07">
         <AnotherRandom>Abc</AnotherRandom>
    </Tag2>
</Tag1>

I am using xml.etree.ElementTree to parse it. My task is to change the text inside RandomTag (in this case "TextText"). This is the Python code:

python code

import xml.etree.ElementTree as ET

customXmlFile = 'file.xml'

ns = {
    'ns': 'http://www.microsoft.com/axe',
    'sc': 'http://schemas.datacontract.org/2004/07/Microsoft.Assessments.Relax.ObjectModel_V1'
}
tree = ET.parse(customXmlFile)
root = tree.getroot()
node = root.find('ns:RandomTag', namespaces=ns)
node.text = 'NEW TEXT'
ET.register_namespace('', 'http://www.microsoft.com/axe')

tree.write(customXmlFile + ".new",
           xml_declaration=True,
           encoding='utf-8',
           method="xml")

I don't get any run-time errors and the code runs, but all the namespaces are declared on the first node (Tag1), and Tag2 and AnotherRandom are written with a prefix (ns1) instead. Also, the SchemaVersion is deleted.

file.xml.new - output

<?xml version='1.0' encoding='utf-8'?>
<Tag1 xmlns="http://www.microsoft.com/axe" xmlns:ns1="http://schemas.datacontract.org/2004/07" SchemaVersion="1.1">
      <RandomTag>NEW TEXT</RandomTag>
      <ns1:Tag2>
             <ns1:AnotherRandom>Abc</ns1:AnotherRandom>
      </ns1:Tag2>
</Tag1>

file.xml.new - desired output

<Tag1 SchemaVersion="1.1" xmlns="http://www.microsoft.com/axe">
    <RandomTag>NEW TEXT</RandomTag>
    <Tag2 xmlns="http://schemas.datacontract.org/2004/07">
         <AnotherRandom>Abc</AnotherRandom>
    </Tag2>
</Tag1>

What should I change to get exactly the same XML format as at the beginning, with only the text changed?

Martin Rezyne
  • Your xml file has some issues as does your code. According to your code it outputs some code. If you could fix the typos that would help us diagnose the problem. Please post the complete working code. For instance, your ns dictionary should be using colons and not equal signs. As well the closing Tag1 should have a forward slash etc. – William Denman Sep 15 '14 at 10:29
  • I fixed those 2 problems. I can't copy the entire xml code because it is a big one. Mainly, the structure is the same as this one and the python code is as shown in the question. – Martin Rezyne Sep 15 '14 at 10:34
  • I also believe it should be namespaces not namespace in the find() call, are you sure there are no more typos? What about your imports etc? You really need to ensure that a copy/paste should be working code. As well, what version of Python are you using? – William Denman Sep 15 '14 at 10:37
  • I am using python 2.7. – Martin Rezyne Sep 15 '14 at 11:10
  • I have edited the code. I executed it, and the output is the same as the one in the question. – Martin Rezyne Sep 15 '14 at 11:11

1 Answer

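This is a bit of a hack, but it does roughly what you want. Be aware that it relies on a private ElementTree attribute (_namespace_map) and maps two namespace URIs to the same empty prefix, so it is fragile and the result may not be well-formed XML. I suggest you check out lxml if you want better handling of namespaces.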

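register_namespace() has to be called before the tree is written; calling it before parsing the file, as below, is fine. Because each call removes any earlier mapping that uses the same prefix, two URIs cannot both be registered with the empty prefix that way, so the second one has to be added to the internal dict by hand.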

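import xml.etree.ElementTree as ET

customXmlFile = 'test.xml'

# Prefixes used only for lookups with find(); they are not written to the output.
ns = {'ns': 'http://www.microsoft.com/axe',
      'sc': 'http://schemas.datacontract.org/2004/07/'}

# Serialize the outer namespace as the default (unprefixed) namespace.
ET.register_namespace('', 'http://www.microsoft.com/axe')
# Hack: _namespace_map is a private ElementTree internal. A second
# register_namespace('') call would drop the mapping above, so the
# second URI is mapped to the empty prefix directly.
ET._namespace_map['http://schemas.datacontract.org/2004/07'] = ''

tree = ET.parse(customXmlFile)
root = tree.getroot()
node = root.find('ns:RandomTag', namespaces=ns)
node.text = 'NEW TEXT'

tree.write(customXmlFile + ".new",
           xml_declaration=True,
           encoding='utf-8',
           method="xml")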
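
The overwriting behaviour of register_namespace() described above can be seen in a small sketch. It assumes Python 3 (for encoding="unicode"); the URIs, the prefix "p" and the serialize() helper are made up for the illustration and are not part of the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical URIs and prefix, chosen only for this illustration.
URI_ONE = "http://example.com/one"
URI_TWO = "http://example.com/two"
SRC = '<a xmlns="%s"><b xmlns="%s"/></a>' % (URI_ONE, URI_TWO)


def serialize():
    """Parse SRC and write it back out with the current prefix registry."""
    return ET.tostring(ET.fromstring(SRC), encoding="unicode")


# Register the prefix "p" for the first URI and serialize.
ET.register_namespace("p", URI_ONE)
first = serialize()

# Registering the same prefix for another URI removes the earlier mapping,
# so the first URI falls back to an auto-generated nsN prefix.
ET.register_namespace("p", URI_TWO)
second = serialize()

print(first)
print(second)
```

In both results every xmlns declaration ends up on the root element, which is the behaviour the question runs into. Note that the registry is global to the module, so registrations made here affect every later serialization in the same process.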

For more information about this see:

http://effbot.org/zone/element-namespaces.htm

Saving XML files using ElementTree

Cannot write XML file with default namespace

William Denman
  • Thank you for your answer. I have tried your solution, but the output isn't the one I want. Indeed, I no longer get the 'ns1' prefixes, but all the namespaces are still declared on the first tag. I need each one to be declared exactly where it was. I did some research on the Internet but found nothing. In the end, I parsed the XML file as a text file and changed the desired value using string functions. – Martin Rezyne Sep 16 '14 at 08:31
  • I realise it was not exactly what you wanted. Glad you found something that worked. – William Denman Sep 16 '14 at 08:56
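
The text-based workaround mentioned in the last comments was not posted, so the following is only a guess at what it could look like. The helper name replace_element_text is hypothetical, and the regular expression is deliberately naive: it assumes the element appears once, contains only text, and is not self-closing or wrapped in comments or CDATA.

```python
import re
from xml.sax.saxutils import escape


def replace_element_text(xml_text, tag, new_text):
    """Replace the text of the first <tag>...</tag> pair in xml_text.

    Everything outside that element, including namespace declarations
    and attribute order, is left byte-for-byte as it was.
    """
    pattern = re.compile(
        r"(<%s(?:\s[^>]*)?>)(.*?)(</%s\s*>)" % (re.escape(tag), re.escape(tag)),
        re.DOTALL,
    )
    # A function is used as the replacement so that backslashes in
    # new_text are not interpreted as group references.
    return pattern.sub(
        lambda m: m.group(1) + escape(new_text) + m.group(3),
        xml_text,
        count=1,
    )


SAMPLE = """<Tag1 SchemaVersion="1.1" xmlns="http://www.microsoft.com/axe">
    <RandomTag>TextText</RandomTag>
    <Tag2 xmlns="http://schemas.datacontract.org/2004/07">
         <AnotherRandom>Abc</AnotherRandom>
    </Tag2>
</Tag1>
"""

print(replace_element_text(SAMPLE, "RandomTag", "NEW TEXT"))
```

Because the file is never run through an XML serializer, the namespace declarations stay exactly where they were, at the cost of not validating the document at all.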