
I am trying to edit an XML file using the xml.etree library.

My XML:

<ext:UBLExtensions>
    <ext:UBLExtension>
        <ext:ExtensionContent>
        </ext:ExtensionContent>
    </ext:UBLExtension>
</ext:UBLExtensions>

My Python code:

import xml.etree.ElementTree as gfg

tree = gfg.parse('file_name.xml')
root = tree.getroot()
tree.write("file_name.xml")

I haven't changed anything, but my XML becomes this:

<ns1:UBLExtensions>
    <ns1:UBLExtension>
        <ns1:ExtensionContent>
        </ns1:ExtensionContent>
    </ns1:UBLExtension>
</ns1:UBLExtensions>

Why did the prefix in my tags change? How can I avoid this?

Selman

1 Answer

The two documents you've posted are equivalent, as long as both prefixes map to the same namespace URI. When you have something like this:

<document xmlns:doc="http://example.com/document/v1.0">
  <doc:title>An example</doc:title>
</document>

Then that <doc:title> element means "<title> in the http://example.com/document/v1.0 namespace". When you parse the document, your XML parser doesn't particularly care about the prefix, and it will generate a new prefix when writing out the document...

...unless you configure an explicit prefix mapping, which we can do with the register_namespace function. For example:

import xml.etree.ElementTree as etree

etree.register_namespace("ext", "http://example.com/extensions")

tree = etree.parse("data.xml")
tree.write("out.xml")

If data.xml contains:

<example xmlns:ext="http://example.com/extensions">
  <ext:UBLExtensions>
    <ext:UBLExtension>
      <ext:ExtensionContent>
      </ext:ExtensionContent>
    </ext:UBLExtension>
  </ext:UBLExtensions>
</example>

Then the above code will output:

<example xmlns:ext="http://example.com/extensions">
  <ext:UBLExtensions>
    <ext:UBLExtension>
      <ext:ExtensionContent>
      </ext:ExtensionContent>
    </ext:UBLExtension>
  </ext:UBLExtensions>
</example>

Without the call to etree.register_namespace, the output looks like:

<example xmlns:ns0="http://example.com/extensions">
  <ns0:UBLExtensions>
    <ns0:UBLExtension>
      <ns0:ExtensionContent>
      </ns0:ExtensionContent>
    </ns0:UBLExtension>
  </ns0:UBLExtensions>
</example>

It's the same document, and the elements are all still in the same namespace; we're just using a different prefix as the short name of the namespace.
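
One way to check this yourself is to compare the tags the parser produces. The following is only a minimal sketch: the two inline documents are cut-down versions of the examples above, and `qualified_tags` is a hypothetical helper name, not part of the library. ElementTree stores each tag as `{namespace-uri}localname`, so the prefix never appears in the parsed tree.

```python
import xml.etree.ElementTree as ET

# Two documents that differ only in the prefix bound to the namespace.
with_ext = (
    '<example xmlns:ext="http://example.com/extensions">'
    '<ext:UBLExtensions/>'
    '</example>'
)
with_ns0 = (
    '<example xmlns:ns0="http://example.com/extensions">'
    '<ns0:UBLExtensions/>'
    '</example>'
)


def qualified_tags(xml_text):
    """Return the tag of every element, in document order (hypothetical helper)."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


tags_ext = qualified_tags(with_ext)
tags_ns0 = qualified_tags(with_ns0)

print(tags_ext)              # ['example', '{http://example.com/extensions}UBLExtensions']
print(tags_ext == tags_ns0)  # True
```

Both documents yield the same list of tags, which is why the library feels free to pick its own prefix when writing.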

larsks
  • Thank you very much. This is a great answer. Quick question: if I have multiple namespaces inside my XML, can I set them too? – Selman Sep 23 '22 at 14:36
  • Yes, you can make multiple calls to `register_namespace`, and it looks like the `write` method takes a `default_namespace` parameter. – larsks Sep 23 '22 at 14:39
  • I'm using this for e-billing notifications to the government, and I wanted to use the same namespaces as in the sample XML I have. But I guess it doesn't matter what the namespaces are called. They are only there to make the XML work properly and don't need to follow a standard. That's how I understood it, do you agree? – Selman Sep 23 '22 at 14:43
  • That is technically true (well, it doesn't matter what the namespace **prefixes** are. The namespaces **absolutely** matter), and if the XML is solely for machine consumption it's fine. But if the XML is going to be read by people, it makes sense to use the expected namespace prefixes...and since it turns out to be quite easy, you might as well just do it. – larsks Sep 23 '22 at 14:49
  • You've been very, very helpful, and this has helped me a lot. I have another question. I want to write the XML as UTF-8, but I couldn't get that to work. While saving the file I tried with open(fileName, "wb") as files: tree.write(files), but it raises ValueError: binary mode doesn't take an encoding argument. Any idea how I can do this? – Selman Sep 24 '22 at 16:23
  • https://stackoverflow.com/questions/15356641/how-to-write-xml-declaration-using-xml-etree-elementtree – larsks Sep 25 '22 at 11:56
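
To tie the two follow-up questions together, here is a rough sketch that registers more than one prefix and writes a UTF-8 document with an XML declaration. The namespace URIs, the `cbc` prefix, and the inline sample document are assumptions made up for illustration; substitute the URIs from your real file. The ValueError quoted above is the message `open()` raises when it is given binary mode together with an `encoding` argument, so the sketch passes `encoding` to `tree.write` instead.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for illustration.
ET.register_namespace("ext", "http://example.com/extensions")
ET.register_namespace("cbc", "http://example.com/basic")

# The input deliberately uses other prefixes (a, b) for the same URIs.
source = (
    '<example xmlns:a="http://example.com/extensions" '
    'xmlns:b="http://example.com/basic">'
    '<a:UBLExtensions><b:ID>1</b:ID></a:UBLExtensions>'
    '</example>'
)
tree = ET.ElementTree(ET.fromstring(source))

# Write to an in-memory binary buffer. A file opened with
# open(path, "wb"), without an encoding argument, is used the same way.
buffer = io.BytesIO()
tree.write(buffer, encoding="utf-8", xml_declaration=True)

result = buffer.getvalue().decode("utf-8")
print(result)
```

The output starts with an XML declaration naming utf-8, and the elements come out with the registered `ext` and `cbc` prefixes rather than generated ones.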