
I'm writing a Python script to update Visual Studio project files. They look like this:

<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" DefaultTargets="Build" 
      xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
      ...

The following code reads and then writes the file:

import xml.etree.ElementTree as ET

tree = ET.parse(projectFile)
root = tree.getroot()
tree.write(projectFile,
           xml_declaration = True,
           encoding = 'utf-8',
           method = 'xml',
           default_namespace = "http://schemas.microsoft.com/developer/msbuild/2003")

Python throws an error at the last line, saying:

ValueError: cannot use non-qualified names with default_namespace option

This is surprising, since I'm only reading and writing the file, with no editing in between. Visual Studio refuses to load XML files without a default namespace, so omitting it is not an option.

Why does this error occur? Suggestions or alternatives welcome.
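For context, here is a minimal sketch of how ElementTree stores names after parsing. The `SAMPLE` string is a hypothetical stand-in for the project file. Element tags pick up the default namespace, but unprefixed attributes such as `ToolsVersion` stay unqualified, and those bare names appear to be what the `default_namespace` check objects to.

```python
import xml.etree.ElementTree as ET

MSBUILD_NS = "http://schemas.microsoft.com/developer/msbuild/2003"

# Hypothetical stand-in for the real project file.
SAMPLE = (
    '<Project ToolsVersion="4.0" DefaultTargets="Build" '
    'xmlns="%s"><PropertyGroup/></Project>' % MSBUILD_NS
)

root = ET.fromstring(SAMPLE)

# Element tags inherit the default namespace and are stored as "{uri}local".
root_tag = root.tag

# Unprefixed attributes belong to no namespace, so their keys stay bare.
attribute_names = sorted(root.attrib)

print(root_tag)
print(attribute_names)
```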

Andomar

3 Answers

44

This is a duplicate of "Saving XML files using ElementTree".

The solution is to define your default namespace BEFORE parsing the project file.

ET.register_namespace('',"http://schemas.microsoft.com/developer/msbuild/2003")

Then write out your file as

tree.write(projectFile,
           xml_declaration = True,
           encoding = 'utf-8',
           method = 'xml')

This round-trips your file successfully and avoids the creation of ns0: prefixes on every tag.
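Putting the two steps together, a self-contained sketch might look like the following. It uses an in-memory `SAMPLE` document and `io.BytesIO` in place of the real `projectFile`. Both are assumptions made so the example runs on its own, and `round_trip` is a hypothetical helper name.

```python
import io
import xml.etree.ElementTree as ET

MSBUILD_NS = "http://schemas.microsoft.com/developer/msbuild/2003"

# Hypothetical stand-in for the contents of the project file.
SAMPLE = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b'<Project ToolsVersion="4.0" DefaultTargets="Build" '
    b'xmlns="http://schemas.microsoft.com/developer/msbuild/2003">\n'
    b'  <PropertyGroup />\n'
    b'</Project>'
)


def round_trip(xml_bytes):
    """Parse and re-serialize a document, keeping the default namespace."""
    # Map the MSBuild URI to the empty prefix (a process-wide setting).
    ET.register_namespace('', MSBUILD_NS)
    tree = ET.parse(io.BytesIO(xml_bytes))
    out = io.BytesIO()
    # Note: no default_namespace argument here.
    tree.write(out, xml_declaration=True, encoding='utf-8', method='xml')
    return out.getvalue().decode('utf-8')


result = round_trip(SAMPLE)
print(result)
```

Keep in mind that `register_namespace` changes a global registry, so it affects every tree serialized later in the same process.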

WombatPM
  • This approach works. However, `find('mytag', ns)` and `findall('mytag', ns)` methods fail (they return an empty list of elements). It seems that they require a *non-empty* namespace name. Which is fine, unless you want to `write()` the XML file with an empty namespace prefix for elements in the default namespace. (Using Python 2.7.) – peter.slizik May 30 '19 at 10:15
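One way around the search problem raised in this comment, sketched under the assumption that any non-empty prefix is acceptable for lookups: the prefix map passed to `find()` and `findall()` is separate from the registry used when writing, so a named prefix can be used for searching only. The prefix `msb`, the `SAMPLE` document, and the `Configuration` element are made up for illustration.

```python
import xml.etree.ElementTree as ET

MSBUILD_NS = "http://schemas.microsoft.com/developer/msbuild/2003"

# Hypothetical stand-in for the project file.
SAMPLE = (
    '<Project ToolsVersion="4.0" xmlns="%s">'
    '<PropertyGroup><Configuration>Debug</Configuration></PropertyGroup>'
    '</Project>' % MSBUILD_NS
)

root = ET.fromstring(SAMPLE)

# This mapping only affects the search expressions below, not the output.
ns = {"msb": MSBUILD_NS}

groups = root.findall("msb:PropertyGroup", ns)
config = root.find("msb:PropertyGroup/msb:Configuration", ns)

# Equivalent lookup spelling out the namespace in "{uri}local" form.
same_group = root.find("{%s}PropertyGroup" % MSBUILD_NS)

print(len(groups), config.text)
```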
4

I think that lxml does a better job of handling namespaces. It aims for an ElementTree-like interface but uses libxml2 underneath.

>>> import lxml.etree
>>> doc = lxml.etree.fromstring(b"""<?xml version="1.0" encoding="utf-8"?>
... <Project ToolsVersion="4.0" DefaultTargets="Build" 
...       xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
...   <PropertyGroup>
...   </PropertyGroup>
... </Project>""")

>>> print(lxml.etree.tostring(doc, xml_declaration=True, encoding='utf-8', method='xml', pretty_print=True).decode('utf-8'))
<?xml version='1.0' encoding='utf-8'?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0" DefaultTargets="Build">
  <PropertyGroup>
  </PropertyGroup>
</Project>
tdelaney
  • +1 Though it looks like `lxml.etree` does not come with Python for Windows, so I'll accept WombatPM's answer – Andomar Aug 20 '13 at 18:09
  • @WombatPM's answer is great too, but lxml is available for Windows. You can use pip, easy_install and (I think) ActiveState's pypm... or just grab it from PyPI (https://pypi.python.org/pypi/lxml/3.2.3). – tdelaney Aug 20 '13 at 18:19
  • I've used lxml as well. Both are good. – WombatPM Aug 20 '13 at 18:30
0

This was the closest answer I could find to my problem. However, putting

ET.register_namespace('',"http://schemas.microsoft.com/developer/msbuild/2003")

just before parsing my file did not work for me, because my file uses a different namespace.

You need to find the specific namespace that the XML file you are loading uses. To do that, I printed the tag of the root element of the tree, which showed both the namespace and the tag name. Copy that namespace into:

ET.register_namespace('',"XXXXX YOUR NAMESPACEXXXXXX")

before you start parsing your file. That should stop ElementTree from adding namespace prefixes (such as ns0:) when you write.
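Rather than copying the namespace by hand, it can be read from the root tag, which ElementTree stores as `{uri}local`. The sketch below assumes that the root element's namespace is the one to register. `namespace_of` is a hypothetical helper, not part of ElementTree, and `SAMPLE` stands in for the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the file being loaded.
SAMPLE = (
    '<Project ToolsVersion="4.0" '
    'xmlns="http://schemas.microsoft.com/developer/msbuild/2003">'
    '<PropertyGroup/></Project>'
)


def namespace_of(element):
    """Return the namespace URI from a "{uri}local" tag, or "" if none."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


root = ET.fromstring(SAMPLE)
uri = namespace_of(root)

# Registration only matters at write time, so doing it after parsing is fine.
if uri:
    ET.register_namespace("", uri)

output = ET.tostring(root, encoding="unicode")
print(uri)
print(output)
```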

Andomar