
We are processing an XML file format whose vendor changed the namespace of the elements between versions. Many of the elements and attributes are unchanged, so we would like to reuse our small set of XPath queries.

The old format looked like this:

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
... interesting stuff here ...
  </PropertyGroup>
</Project>

while the new format looks like this:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
... interesting stuff here ...
  </PropertyGroup>
</Project>

Our existing code does this:

        XmlDocument doc = new XmlDocument();
        doc.Load(inputFile);
        XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
        nsMgr.AddNamespace("ms", "http://schemas.microsoft.com/developer/msbuild/2003");
        ...
        foreach (XmlNode xn in doc.DocumentElement.SelectNodes("//ms:PropertyGroup", nsMgr)) ...

For the new format, I can do the following, and it works in my VS2022 setup. However, I am not sure whether registering an empty namespace via AddNamespace is "legal":

        XmlDocument doc = new XmlDocument();
        doc.Load(path);
        XmlElement root = doc.DocumentElement;
        XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
        if (!String.IsNullOrEmpty(root.NamespaceURI))
        {
            nsMgr.AddNamespace("ms", root.NamespaceURI);
        } else {
            // <Project Sdk="Microsoft.NET.Sdk">
            // Just register ms: as empty
            nsMgr.AddNamespace("ms", "");
        }

I found this other answer:

Use of an empty string in a namespace declaration has a specific meaning in XML Namespaces 1.1: it turns it into an "undeclaration", so within its scope, the prefix "em" is not associated with any URI. But ... XML Namespaces 1.0 explicitly says (§2.2): The empty string, though it is a legal URI reference, cannot be used as a namespace name.

... but this is from the XML side of things, not the parser / parser helper side.

Q: Using C# System.Xml.*, how do I XPath-query two different XML files that have the same structure, but one declares a toplevel namespace and the other does not?

Jonathan Dodds
Martin Ba
  • Maybe some useful comments or links here: https://stackoverflow.com/questions/7178111/why-is-xmlnamespacemanager-necessary – Martin Ba Jun 12 '23 at 12:40

3 Answers


If you have tested and confirmed that registering an empty namespace works as desired with the library you are using (System.Xml), then use that approach.

Create unit tests that will fail if a future version of the library changes behavior. But because the current behavior is within the standard and because changing the behavior would be a breaking change, it is very unlikely that the library would ever change behavior on this.
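As a rough sketch of such a regression check, the program below runs the same ms:PropertyGroup query against a legacy-style and an SDK-style document and throws if either stops matching. The class name, method name, and inline XML strings are made up for illustration, and a real project would more likely put the two checks into its existing test framework. It relies on the behavior reported in the question (a prefix bound to the empty string matches elements in no namespace), which is exactly what the check is meant to pin down.

```csharp
using System;
using System.Xml;

static class EmptyNamespaceCheck
{
    // Counts PropertyGroup elements via the ms: prefix, which is bound to
    // whatever namespace the root element is in (possibly none).
    public static int CountPropertyGroups(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var nsMgr = new XmlNamespaceManager(doc.NameTable);
        // NamespaceURI is "" for the SDK-style root; this is the call under test.
        nsMgr.AddNamespace("ms", doc.DocumentElement.NamespaceURI);

        return doc.DocumentElement.SelectNodes("//ms:PropertyGroup", nsMgr).Count;
    }

    static void Main()
    {
        const string legacy =
            "<Project xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">" +
            "<PropertyGroup /></Project>";
        const string sdkStyle =
            "<Project Sdk=\"Microsoft.NET.Sdk\"><PropertyGroup /></Project>";

        if (CountPropertyGroups(legacy) != 1)
            throw new Exception("Legacy (namespaced) query no longer matches.");
        if (CountPropertyGroups(sdkStyle) != 1)
            throw new Exception("Empty-namespace prefix no longer matches.");

        Console.WriteLine("Both formats matched.");
    }
}
```

Passing root.NamespaceURI in both cases is equivalent to the if/else in the question, since the else branch only runs when that value is empty.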

MSBuild Files

You are evidently hand-parsing MSBuild files, but the question does not say to what end.

If you are changing property values within the file via XPath, don't. Define your properties so that they can be overridden and pass the property values to MSBuild via environment variables and/or the command line /property switch.

If you are querying for specific settings or values, you can extend your projects with a custom target that reports the information and use MSBuild to run that target. You can write your custom target once and share it across projects by using a Directory.Build.targets file.

For something more complex, you may consider using the Microsoft.Build library. The library supports both legacy style projects and SDK style projects.
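A minimal sketch of that last option, assuming the Microsoft.Build NuGet package is referenced and the project path is passed as the first command-line argument (both are assumptions, not something stated in the question). It reads the project as written, without evaluating it, so property values appear as unexpanded text. Depending on the setup, locating an MSBuild installation first (for example with the Microsoft.Build.Locator package) may also be needed.

```csharp
using System;
using Microsoft.Build.Construction;

static class ListProperties
{
    static void Main(string[] args)
    {
        // args[0]: path to a project file (hypothetical input).
        ProjectRootElement project = ProjectRootElement.Open(args[0]);

        // Walk every <PropertyGroup> and print its properties. This works the
        // same way whether or not the file declares the legacy namespace.
        foreach (ProjectPropertyGroupElement group in project.PropertyGroups)
        {
            foreach (ProjectPropertyElement property in group.Properties)
            {
                Console.WriteLine($"{property.Name} = {property.Value}");
            }
        }
    }
}
```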

Jonathan Dodds

Although it is more verbose and the XPath is a bit ugly, you could make the queries namespace-agnostic by matching generically on * and then using a predicate to filter on the local-name() of the element.

For example:

//*[local-name() = "PropertyGroup"]

instead of:

//ms:PropertyGroup

Then you don't need to worry about registering a namespace-prefix and what namespace is bound to it.
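For illustration, here is how that query might be used from System.Xml against both formats. The class name and the inline XML strings are invented for the example.

```csharp
using System;
using System.Xml;

static class LocalNameQuery
{
    public static int CountPropertyGroups(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // No XmlNamespaceManager needed: the predicate compares local names only.
        return doc.SelectNodes("//*[local-name() = 'PropertyGroup']").Count;
    }

    static void Main()
    {
        const string legacy =
            "<Project xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">" +
            "<PropertyGroup /></Project>";
        const string sdkStyle =
            "<Project Sdk=\"Microsoft.NET.Sdk\"><PropertyGroup /></Project>";

        Console.WriteLine(CountPropertyGroups(legacy));   // 1
        Console.WriteLine(CountPropertyGroups(sdkStyle)); // 1
    }
}
```

The trade-off is that the query also matches a PropertyGroup element from any other namespace, which may or may not matter for these files.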

Mads Hansen
  • "a bit" ugly is a major understatement - Any non-trivial XPath expression will become utterly unreadable :-D ... but thanks. I could have mentioned that I already discarded that idea. I guess it's a valid approach, but not something I consider readable :-) – Martin Ba Jun 12 '23 at 14:57

It is better to use the LINQ to XML API. It has been available in the .NET Framework since 2007.

The code below is generic. It gets the default namespace via a GetDefaultNamespace() call if there is one, and ignores it if there is not.

c#

// Requires: using System; using System.Xml.Linq;
void Main()
{
    const string filePath = @"e:\Temp\input.xml";
    XDocument xdoc = XDocument.Load(filePath);

    // Returns XNamespace.None when the root declares no default namespace.
    XNamespace ns = xdoc.Root.GetDefaultNamespace();

    foreach (XElement propertyGroup in xdoc.Descendants(ns + "PropertyGroup"))
    {
        Console.WriteLine(propertyGroup);
    }
}
Yitzhak Khabinsky