
This is a follow-up question to "Modify a XML using ElementTree".

My XML now has namespaces. I tried to follow the answer at "Parsing XML with namespace in Python via 'ElementTree'" and came up with the following.

XML file.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 <grandParent>
  <parent>
   <child>Sam/Astronaut</child>
  </parent>
 </grandParent>
</project>

My Python code, written after reading "Parsing XML with namespace in Python via 'ElementTree'":

import xml.etree.ElementTree as ET

spaces={'xmlns':'http://maven.apache.org/POM/4.0.0','schemaLocation':'http://maven.apache.org/xsd/maven-4.0.0.xsd'}

tree = ET.parse("test.xml")
a=tree.find('parent')          
for b in a.findall('child', namespaces=spaces):
 if b.text.strip()=='Jay/Doctor':
    print "child exists"
    break
else:
    ET.SubElement(a,'child').text="Jay/Doctor"

tree.write("test.xml")

I get the error: AttributeError: 'NoneType' object has no attribute 'findall'

nick01
  • Neither of the code snippets you posted is valid Python. There's stray bits of XML in the first, messed up indentation in both, and missing brackets in the second. – Lukas Graf Jul 31 '14 at 22:52
  • Yes my bad. I tried to correct it now. – nick01 Jul 31 '14 at 22:54
  • 1
    Aside: the indentation of `else` is incorrect. It wants to line up with `for`, not with `if`. – Robᵩ Jul 31 '14 at 23:17

1 Answer


There are two problems on this line:

a=tree.find('parent')          

First, <parent> is not an immediate child of the root element; it is a grandchild. The path to it looks like /project/grandParent/parent. To search for <parent>, try the XPath expression */parent or possibly .//parent.

Second, <parent> exists in the default namespace, so you won't be able to .find() it with just its simple name. You'll need to add the namespace.

Here are two equally valid calls to tree.find(), each of which should find the <parent> node:

a=tree.find('*/{http://maven.apache.org/POM/4.0.0}parent')
a=tree.find('*/xmlns:parent', namespaces=spaces)
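
To make the difference concrete, here is a minimal sketch in Python 3 that runs the lookups against an inline, trimmed copy of the question's XML instead of test.xml. The names POM_NS and ns and the prefix "pom" are my own choices for illustration; the prefix in the dictionary is arbitrary and does not have to match anything in the file.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the question's XML; the variable name is illustrative.
POM_NS = "http://maven.apache.org/POM/4.0.0"

XML_TEXT = """<project xmlns="http://maven.apache.org/POM/4.0.0">
 <grandParent>
  <parent>
   <child>Sam/Astronaut</child>
  </parent>
 </grandParent>
</project>"""

root = ET.fromstring(XML_TEXT)

# The prefix "pom" is arbitrary; only the URI it maps to matters.
ns = {"pom": POM_NS}

# Unqualified name: no match, because the real tag is "{uri}parent".
plain = root.find("*/parent")

# Fully qualified name written out in Clark notation.
by_uri = root.find("*/{%s}parent" % POM_NS)

# Prefixed name resolved through the namespaces mapping.
by_prefix = root.find("*/pom:parent", namespaces=ns)

print(plain)                # None
print(by_uri is by_prefix)  # True
```

The unqualified search returning None is exactly what produced the AttributeError in the question.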

Next, the call to findall() needs a namespace qualifier:

for b in a.findall('xmlns:child', namespaces=spaces):

Fourth, the call to create the new child element needs a namespace qualifier. There may be a way to use the shortcut name, but I couldn't find it. I had to use the long form of the name.

ET.SubElement(a,'{http://maven.apache.org/POM/4.0.0}child').text="Jay/Doctor"
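
As a rough illustration of why the qualifier matters here, the Python 3 sketch below adds one child with a bare tag and one with the qualified tag, then searches the namespace. The inline XML is a trimmed copy of the question's file, and POM_NS, ns, and the "Unqualified" text are placeholders I made up for the demonstration.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"  # URI from the question's XML
ns = {"pom": POM_NS}                          # illustrative prefix mapping

root = ET.fromstring(
    '<project xmlns="%s"><grandParent><parent>'
    '<child>Sam/Astronaut</child>'
    '</parent></grandParent></project>' % POM_NS
)
parent = root.find("*/pom:parent", namespaces=ns)

# Bare tag: this element is created in no namespace at all.
plain_child = ET.SubElement(parent, "child")
plain_child.text = "Unqualified"

# Qualified tag: this element joins the existing <child> in the POM namespace.
qualified_child = ET.SubElement(parent, "{%s}child" % POM_NS)
qualified_child.text = "Jay/Doctor"

# Only the namespaced children are found by the qualified search.
found = [c.text for c in parent.findall("pom:child", namespaces=ns)]
print(found)  # ['Sam/Astronaut', 'Jay/Doctor']
```

The bare-tag child is still in the tree, but a later namespaced findall() will not see it, so the "does it already exist" check would keep adding duplicates.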

Finally, unless you provide a default namespace, ElementTree writes every tag with a generated prefix such as ns0:, which makes the output ugly:

tree.write('test.xml', default_namespace=spaces['xmlns'])
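
For comparison, this Python 3 sketch serializes a tiny namespaced tree three ways: with no options, with default_namespace, and after the ET.register_namespace('', ...) call mentioned in the comments. It writes to an in-memory buffer rather than test.xml, and the serialize helper is a hypothetical convenience function, not part of ElementTree.

```python
import io
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"  # URI from the question's XML

root = ET.fromstring('<project xmlns="%s"><grandParent/></project>' % POM_NS)
tree = ET.ElementTree(root)


def serialize(tree, **write_kwargs):
    """Hypothetical helper: write the tree to memory and return it as text."""
    buf = io.BytesIO()
    tree.write(buf, **write_kwargs)
    return buf.getvalue().decode("ascii")


# 1. No options: ElementTree invents a prefix for the namespace.
generated = serialize(tree)

# 2. default_namespace: tags are written without a prefix.
with_default = serialize(tree, default_namespace=POM_NS)

# 3. register_namespace with an empty prefix. This changes module-wide
#    state, so it affects every later write in the same process.
ET.register_namespace("", POM_NS)
registered = serialize(tree)

print(generated)
print(with_default)
print(registered)
```

One caveat, as far as I can tell: default_namespace raises a ValueError if the tree contains any element without a namespace, so it only works once every added element uses the qualified tag.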

Unrelated to the XML aspects, you copied my answer from the previous question incorrectly. The else lines up with the for, not with the if:

for ...
  if ...
else ...
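
A small Python 3 sketch of the for/else shape, using plain strings instead of XML elements so it runs on its own. The function name check_or_add and the returned messages are made up for illustration.

```python
def check_or_add(names, wanted):
    """Return a message describing whether `wanted` was already in `names`."""
    for name in names:
        if name.strip() == wanted:
            result = "child exists"
            break
    else:
        # Runs only when the loop finished without hitting `break`.
        names.append(wanted)
        result = "child added"
    return result


people = ["Sam/Astronaut"]
first = check_or_add(people, "Jay/Doctor")   # not present yet, so it is added
second = check_or_add(people, "Jay/Doctor")  # present now, so the loop breaks
print(first, second, people)
```

If the else were attached to the if instead, the add would run once for every non-matching entry rather than once after the whole search fails.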
Robᵩ
  • I don't mind removing the project tag and add the name space to grandParent tag. – nick01 Jul 31 '14 at 23:25
  • Works. You saved my day. Working with namespaces was difficult. BTW How do I add a new line character after the sub element has been written/added? – nick01 Jul 31 '14 at 23:48
  • `newkid=ET.SubElement(...) ; newkid.text="Jay/Dr" ; newkid.tail="\n"` – Robᵩ Jul 31 '14 at 23:54
  • tree.write('test.xml', default_namespace=spaces['xmlns']) what if I can't provide default_namespace argument? any other way to make sure it isn't ugly? – nick01 Aug 01 '14 at 00:09
  • It won't be terribly ugly. No, I don't know of any other way. – Robᵩ Aug 01 '14 at 00:12
  • Add this anywhere before `tree.write()`: `ET.register_namespace('', 'http://maven.apache.org/POM/4.0.0')` – Robᵩ Aug 01 '14 at 00:20
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/58458/discussion-between-user2812714-and-rob). – nick01 Aug 01 '14 at 00:22