47

Consider this simple XML document. The XML shown here is the output of an XmlSerializer applied to a complex POCO whose schema I have no control over.

<My_RootNode xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="">
  <id root="2.16.840.1.113883.3.51.1.1.1" extension="someIdentifier" xmlns="urn:hl7-org:v3" /> 
  <creationTime xsi:nil="true" xmlns="urn:hl7-org:v3" />      
</My_RootNode>

The goal is to extract the value of the extension attribute on the id node. To do that, we call the SelectSingleNode method with the following XPath expression:

XmlNode idNode = myXmlDoc.SelectSingleNode("/My_RootNode/id");
//idNode is evaluated to null at this point in the debugger!
string msgID = idNode.Attributes.GetNamedItem("extension").Value;

The problem is that the SelectSingleNode method returns null for the given XPath expression.

Question: any ideas on this XPath query's correctness, or why this method call + XPath expression would return a null value? Perhaps the namespaces are part of the problem?

p.campbell
  • First thing to check is whether the XML document has been loaded correctly. I can see an empty xmlns attribute at the end of the root node - is that right? – Oded Jul 06 '09 at 21:04
  • @Oded: Correct, we're looking at an XmlDocument which has loaded the string output of an XmlSerializer. – p.campbell Jul 06 '09 at 21:10
  • @pcampbell: is this a large document (HL7!)? If so, then you may want to try serializing directly into the XmlDocument. If you want a sample of that, let me know. – John Saunders Jul 06 '09 at 21:13

9 Answers

59

I strongly suspect the problem is to do with namespaces. Try getting rid of the namespace and you'll be fine - but obviously that won't help in your real case, where I'd assume the document is fixed.

I can't remember offhand how to specify a namespace in an XPath expression, but I'm sure that's the problem.

EDIT: Okay, I've remembered how to do it now. It's not terribly pleasant though - you need to create an XmlNamespaceManager for it. Here's some sample code that works with your sample document:

using System;
using System.Xml;

public class Test
{
    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        XmlNamespaceManager namespaces = new XmlNamespaceManager(doc.NameTable);
        namespaces.AddNamespace("ns", "urn:hl7-org:v3");
        doc.Load("test.xml");
        XmlNode idNode = doc.SelectSingleNode("/My_RootNode/ns:id", namespaces);
        string msgID = idNode.Attributes["extension"].Value;
        Console.WriteLine(msgID);
    }
}
Jon Skeet
17

If you want to ignore namespaces completely, you can use this:

static void Main(string[] args)
{
    string xml =
        "<My_RootNode xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" xmlns=\"\">\n" +
        "    <id root=\"2.16.840.1.113883.3.51.1.1.1\" extension=\"someIdentifier\" xmlns=\"urn:hl7-org:v3\" />\n" +
        "    <creationTime xsi:nil=\"true\" xmlns=\"urn:hl7-org:v3\" />\n" +
        "</My_RootNode>";

    XmlDocument doc = new XmlDocument();
    doc.LoadXml(xml);

    XmlNode idNode = doc.SelectSingleNode("/*[local-name()='My_RootNode']/*[local-name()='id']");
}
mrzli
14

This should work in your case without removing namespaces:

XmlNode idNode = myXmlDoc.GetElementsByTagName("id")[0];
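If you would rather match on the namespace explicitly, XmlDocument also has a GetElementsByTagName overload that takes a local name and a namespace URI. Here is a minimal sketch of that variant; the class name TagNameLookup and the method FindIdExtension are made up for illustration, and the namespace URI is the one from the question's sample document:

```csharp
using System.Xml;

public static class TagNameLookup
{
    // Hypothetical helper: returns the "extension" attribute of the first
    // <id> element in the urn:hl7-org:v3 namespace, or null if none exists.
    public static string FindIdExtension(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Namespace-aware overload: (localName, namespaceURI).
        XmlNodeList matches = doc.GetElementsByTagName("id", "urn:hl7-org:v3");
        if (matches.Count == 0)
        {
            return null;
        }

        XmlAttribute extension = matches[0].Attributes["extension"];
        return extension == null ? null : extension.Value;
    }
}
```

Checking Count before indexing avoids dereferencing a missing node when the document contains no matching element.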
tandrasz
8

Sorry, you forgot the namespace. You need:

XmlNamespaceManager ns = new XmlNamespaceManager(myXmlDoc.NameTable);
ns.AddNamespace("hl7","urn:hl7-org:v3");
XmlNode idNode = myXmlDoc.SelectSingleNode("/My_RootNode/hl7:id", ns);

In fact, whether here or in web services, getting null back from an XPath operation or anything that depends on XPath usually indicates a problem with XML namespaces.
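Since a null result is the usual symptom, it can help to guard against it instead of dereferencing the node straight away. The following is only a sketch of that idea under the assumptions of the question's sample document (root element My_RootNode, id in urn:hl7-org:v3); the class ExtensionReader and the method ReadExtension are hypothetical names:

```csharp
using System.Xml;

public static class ExtensionReader
{
    // Hypothetical helper: reads /My_RootNode/hl7:id/@extension and returns
    // null (rather than throwing) when the element or attribute is missing.
    public static string ReadExtension(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // The prefix "hl7" is arbitrary; only the URI has to match the document.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("hl7", "urn:hl7-org:v3");

        XmlNode idNode = doc.SelectSingleNode("/My_RootNode/hl7:id", ns);
        if (idNode == null)
        {
            return null; // no match, most often a namespace mismatch
        }

        XmlAttribute extension = idNode.Attributes["extension"];
        return extension == null ? null : extension.Value;
    }
}
```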

John Saunders
  • Thanks John, actually the namespace is missing/blank in the test data! Do you suspect that's part of the problem? – p.campbell Jul 06 '09 at 21:04
  • I believe John is almost completely correct, because the full name of the "id" element is the pair "urn:hl7-org:v3" and "id". You're searching for "" and "id" with your XPath, so it won't find anything. However, to actually work, you need to pass the ns instance as the second parameter of SelectSingleNode. – Steven Sudit Jul 06 '09 at 21:10
  • Doh - spent all that time coming up with a test program, only to find you'd beaten me to it :) – Jon Skeet Jul 06 '09 at 21:11
  • @Jon: I should frame that. (ok, not really). Besides, Steven caught me leaving off the ",ns" – John Saunders Jul 06 '09 at 21:12
  • @Steven: good catch, and the most polite way of saying, "hey dummy, you forgot to use the object you just constructed" that I've heard in a while. "Almost completely correct" - I'll have to remember that. – John Saunders Jul 06 '09 at 21:15
3

Just to build upon solving the namespace issues, in my case I've been running into documents with multiple namespaces and needed to handle namespaces properly. I wrote the function below to get a namespace manager to deal with any namespace in the document:

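// Requires: using System.Collections.Generic; using System.Xml; using System.Xml.XPath;
private XmlNamespaceManager GetNameSpaceManager(XmlDocument xDoc)
{
    XmlNamespaceManager nsm = new XmlNamespaceManager(xDoc.NameTable);

    // Move to the first element (the document root) and read the namespaces in scope there.
    XPathNavigator RootNode = xDoc.CreateNavigator();
    RootNode.MoveToFollowing(XPathNodeType.Element);
    IDictionary<string, string> NameSpaces = RootNode.GetNamespacesInScope(XmlNamespaceScope.All);

    // Register each prefix/URI pair with the namespace manager.
    foreach (KeyValuePair<string, string> kvp in NameSpaces)
    {
        nsm.AddNamespace(kvp.Key, kvp.Value);
    }

    return nsm;
}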
2

Well... I had the same issue and it was a headache. Since I didn't care much about the namespace or the xml schema, I just deleted this data from my xml and it solved all my issues. May not be the best answer? Probably, but if you don't want to deal with all of this and you ONLY care about the data (and won't be using the xml for some other task) deleting the namespace may solve your problems.

XmlDocument vinDoc = new XmlDocument();
string vinInfo = "your xml string";
vinDoc.LoadXml(vinInfo);

vinDoc.InnerXml = vinDoc.InnerXml.Replace("xmlns=\"http://tempuri.org/\"", "");
Roisgoen
  • This will only work for your particular data. It's not a general answer. – John Saunders Jan 21 '15 at 16:34
  • If you have control over the xsd, the xml and the code consuming it, it's an excellent example of one way to handle the problem. I've taken this answer and generalized it a bit by using a RegEx and uploaded that to this thread. – David Apr 14 '17 at 18:37
1

The rule to keep in mind is: if your document specifies a namespace, you MUST use an XmlNamespaceManager in your call to SelectNodes() or SelectSingleNode(). That's a good thing.

See the article Advantages of namespaces. Jon Skeet does a great job in his answer showing how to use XmlNamespaceManager. (This answer should really just be a comment on that answer, but I don't quite have enough Rep Points to comment.)
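The same rule applies to SelectNodes(). As a rough sketch, assuming the sample document from the question, the snippet below lists every child of the root that belongs to the urn:hl7-org:v3 namespace; the class Hl7ChildLister and the method ListHl7Children are invented names for illustration:

```csharp
using System.Collections.Generic;
using System.Xml;

public static class Hl7ChildLister
{
    // Hypothetical helper: returns the local names of all children of
    // My_RootNode that live in the urn:hl7-org:v3 namespace.
    public static List<string> ListHl7Children(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("hl7", "urn:hl7-org:v3");

        var names = new List<string>();
        // "hl7:*" matches any element whose namespace is bound to the hl7 prefix.
        foreach (XmlNode node in doc.SelectNodes("/My_RootNode/hl7:*", ns))
        {
            names.Add(node.LocalName);
        }
        return names;
    }
}
```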

teo van kot
Erica Ackerman
0

Just use //id instead of /id. It works fine in my code.

dong
-1

Roisgoen's answer worked for me, but to make it more general, you can use a RegEx:

//Substitute "My_RootNode" for whatever your root node is
string strRegex = @"<My_RootNode(?<xmlns>\s+xmlns([\s]|[^>])*)>";
var myMatch = new Regex(strRegex, RegexOptions.None).Match(myXmlDoc.InnerXml);
if (myMatch.Success)
{
    var grp = myMatch.Groups["xmlns"];
    if (grp.Success)
    {
        myXmlDoc.InnerXml = myXmlDoc.InnerXml.Replace(grp.Value, "");
    }
}

I fully admit that this is not a best-practice answer, but it's an easy fix and sometimes that's all we need.

David