3

Suppose I have an XML string:

<A>
    <B foo="123">
        <C>thing</C>
        <D>stuff</D>
    </B>
</A>

and I want to insert a namespace of the type used by XML Schema, putting a prefix in front of all the element names.

<A xmlns:ns1="www.example.com">
    <ns1:B foo="123">
        <ns1:C>thing</ns1:C>
        <ns1:D>stuff</ns1:D>
    </ns1:B>
</A>

Is there a way to do this (aside from brute-force find-replace or regex) using lxml.etree or a similar library?

Eli Rose
  • There is no prefix on the `A` element in the wanted output. Typo? – mzjn Aug 06 '15 at 20:19
  • @mzjn: Is there supposed to be a namespace prefix on the root element too? The code I'm working on has none (and does not complain) but I could definitely believe it. – Eli Rose Aug 06 '15 at 20:32
  • You said "putting a prefix in front of **all** the element names", so I had to ask. If you don't want a prefix on `A`, that's fine. – mzjn Aug 06 '15 at 20:36

3 Answers

4

I don't think this can be done with just ElementTree.

Manipulating namespaces is sometimes surprisingly tricky. There are many questions about it here on SO. Even with the more advanced lxml library, it can be really hard.

Below is a solution that uses XSLT.

Code:

from lxml import etree

XML = '''
<A>
    <B foo="123">
        <C>thing</C>
        <D>stuff</D>
    </B>
</A>'''

XSLT = '''
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:ns1="www.example.com">
 <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <xsl:template match="*">
   <xsl:element name="ns1:{name()}">
    <xsl:apply-templates select="node()|@*"/>
   </xsl:element>
  </xsl:template>

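  <!-- Copy attributes as attributes; the built-in rule would emit their values as text -->
  <xsl:template match="@*">
   <xsl:copy/>
  </xsl:template>

  <!-- No prefix on the A element -->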
  <xsl:template match="A">
   <A xmlns:ns1="www.example.com">
    <xsl:apply-templates select="node()|@*"/>
   </A>
  </xsl:template>
</xsl:stylesheet>'''

xml_doc = etree.fromstring(XML)
xslt_doc = etree.fromstring(XSLT)
transform = etree.XSLT(xslt_doc)
print(transform(xml_doc))

Output:

<A xmlns:ns1="www.example.com">
    <ns1:B foo="123">
        <ns1:C>thing</ns1:C>
        <ns1:D>stuff</ns1:D>
    </ns1:B>
</A>
mzjn
1

Use ET.register_namespace('ns1', 'www.example.com') to register the namespace with ElementTree. This is needed so that write() uses the registered prefix. (I have code that uses an empty string, '', as the prefix for the default namespace.)

Then prefix each element name with {www.example.com}. For example: root.find('{www.example.com}B').
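Prefixing each element still means walking the tree, but the walk is short. Below is a minimal sketch using the standard library's xml.etree.ElementTree, assuming none of the elements is already in a namespace. The helper name qualify_descendants and the prefix 'ex' are my own choices, not part of the answer above. One caveat: the standard library reserves prefixes of the form ns<digits> for the prefixes it generates itself, so register_namespace('ns1', ...) is rejected there and the sketch registers 'ex' instead.

```python
import xml.etree.ElementTree as ET

NS_URI = "www.example.com"

# Prefixes matching ns<digits> are reserved by ElementTree for
# auto-generated prefixes, so this call raises ValueError.
try:
    ET.register_namespace("ns1", NS_URI)
    ns1_rejected = False
except ValueError:
    ns1_rejected = True

# "ex" is a hypothetical prefix chosen for this sketch.
ET.register_namespace("ex", NS_URI)


def qualify_descendants(root, uri):
    """Put every element below root into namespace uri; root itself is left alone."""
    for elem in root.iter():
        if elem is not root:
            # Clark notation: {namespace-uri}local-name
            elem.tag = "{%s}%s" % (uri, elem.tag)


root = ET.fromstring('<A><B foo="123"><C>thing</C><D>stuff</D></B></A>')
qualify_descendants(root, NS_URI)

result = ET.tostring(root, encoding="unicode")
print(result)
```

After the walk, root.find('{www.example.com}B') works as described above, and the serializer declares the registered prefix on the root element.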

dsh
  • I think the OP wants to convert a bare `` element into an `` element. I don't think this is going to be possible without iteratively creating new elements and copying over content and attributes from the original tree. – larsks Aug 06 '15 at 20:18
  • The first part sounds promising, but is there a way to prefix each element name with `{www.example.com}` without walking the tree and adding it manually? – Eli Rose Aug 07 '15 at 11:35
1
import xml.etree.ElementTree as ET

name_space = {
    # Namespace declaration, passed to the root element as a plain attribute
    "xmlns:ns1": "www.example.com"
}

A = ET.Element('A', name_space)
B = ET.SubElement(A, 'ns1:B')
C = ET.SubElement(B, 'ns1:C')
C.text = 'thing'

You can pass the namespace declarations to the Element constructor as attributes; from that point on you can use the prefix in any child element. Note that you can define more than one. This solution worked for me.
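The same idea can probably be applied to a tree parsed from the string in the question, rather than one built by hand. This is only a sketch, and it relies on ElementTree writing names such as 'ns1:B' and 'xmlns:ns1' out verbatim; the in-memory tree is therefore not namespace-aware until the output is parsed again. The variable names are mine.

```python
import xml.etree.ElementTree as ET

XML = """<A>
    <B foo="123">
        <C>thing</C>
        <D>stuff</D>
    </B>
</A>"""

root = ET.fromstring(XML)

# Declare the prefix on the root as an ordinary attribute.
root.set("xmlns:ns1", "www.example.com")

# Rename every element except the root with a literal "ns1:" prefix.
for elem in root.iter():
    if elem is not root:
        elem.tag = "ns1:" + elem.tag

output = ET.tostring(root, encoding="unicode")
print(output)

# Parsing the output again yields properly namespaced elements.
reparsed = ET.fromstring(output)
print(reparsed[0].tag)
```

Because the prefixes are just text at this stage, calls like find() on the modified tree need the literal name 'ns1:B'; the '{www.example.com}B' form only applies after reparsing.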