4

I have an XML packet received from a third-party web server:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <SomeResponse xmlns="http://someurl">
      <SomeResult>
        .....
      </SomeResult>
    </SomeResponse>
  </soap:Body>
</soap:Envelope>

To be cross-platform capable, this XML is loaded into Delphi's IXMLDocument:

XmlDoc.LoadFromXML(XmlString);

I'm using a solution from another answer to find an XML node using XPath. It works in other cases; however, when the XML document contains namespace prefixes, it fails.

I'm trying to access path:

/soap:Envelope/soap:Body/SomeResponse/SomeResult

From the linked answer:

function selectNode(xnRoot: IXmlNode; const nodePath: WideString): IXmlNode;
var
  intfSelect : IDomNodeSelect;
  dnResult : IDomNode;
  intfDocAccess : IXmlDocumentAccess;
  doc: TXmlDocument;
begin
  Result := nil;
  if not Assigned(xnRoot) or not Supports(xnRoot.DOMNode, IDomNodeSelect, intfSelect) then
    Exit;
  dnResult := intfSelect.selectNode(nodePath);
  if Assigned(dnResult) then
  begin
    if Supports(xnRoot.OwnerDocument, IXmlDocumentAccess, intfDocAccess) then
      doc := intfDocAccess.DocumentObject
    else
      doc := nil;
    Result := TXmlNode.Create(dnResult, nil, doc);
  end;
end;

It fails at dnResult := intfSelect.selectNode(nodePath); with EOleException: Reference to undeclared namespace prefix: 'soap'

How do I make this work when the node names have a namespace prefix?

Jerry Dodge
  • You need to somehow tell the XPath processor about the namespace URLs used in the document. The names used in the document aren't important (which is good since the SomeResponse node's namespace isn't named). Some XML libraries have functions that take namespace mappings; perhaps this one does, too. Then you'd pick a name for the `http://someurl` namespace (e.g., `foo`), and then use that same chosen name in your XPath query (e.g., `foo:SomeResponse`). You'd need to include names for the other namespaces, too. – Rob Kennedy Jun 07 '15 at 00:01
  • easiest way to get `SomeResult` is to use `//SomeResponse/SomeResult` instead of using the full path. `SelectionNamespaces` property for XmlDoc might also help. Look [here](http://stackoverflow.com/questions/1519416/delphi-msxml-xpath-queries-fail) – kobik Jun 07 '15 at 10:08
  • @kobik: please note that `SelectionNamespaces` is specific for the msxml DOM provider and that the OP is looking for a xplat solution ;) – whosrdaddy Jun 07 '15 at 10:59

5 Answers

3

Do not try to include namespaces in your XPath query. If all you want is the text of the SomeResult node, you can use '//SomeResult' as the query. For some reason the default XML implementation (MSXML) barfs on the default namespace xmlns="http://someurl" on the SomeResponse parent node. However, using OmniXML as the DOMVendor (cross-platform and available from XE7, thanks to @gabr), this works:

program Project3;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  Xml.XmlIntf,
  Xml.XMLDoc,
  Xml.XMLDom,
  Xml.omnixmldom,
  System.SysUtils;

const
 xml = '<?xml version="1.0" encoding="utf-8"?>'+#13#10+
        '<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'+#13#10+
        'xmlns:xsd="http://www.w3.org/2001/XMLSchema"'+#13#10+
        'xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'+#13#10+
        ' <soap:Body>'+#13#10+
        '  <SomeResponse xmlns="http://tempuri.org">'+#13#10+
        '   <SomeResult>1</SomeResult>'+#13#10+
        '  </SomeResponse>'+#13#10+
        ' </soap:Body>'+#13#10+
        '</soap:Envelope>';

function selectNode(xnRoot: IXmlNode; const nodePath: WideString): IXmlNode;
var
  intfSelect : IDomNodeSelect;
  dnResult : IDomNode;
  intfDocAccess : IXmlDocumentAccess;
  doc: TXmlDocument;
begin
  Result := nil;
  if not Assigned(xnRoot) or not Supports(xnRoot.DOMNode, IDomNodeSelect, intfSelect) then
    Exit;
  dnResult := intfSelect.selectNode(nodePath);
  if Assigned(dnResult) then
  begin
    if Supports(xnRoot.OwnerDocument, IXmlDocumentAccess, intfDocAccess) then
      doc := intfDocAccess.DocumentObject
    else
      doc := nil;
    Result := TXmlNode.Create(dnResult, nil, doc);
  end;
end;

function XPathQuery(Doc : IXMLDocument; Query : String) : String;

var
 Node : IXMLNode;

begin
 Result := '';
 Node := SelectNode(Doc.DocumentElement, Query);
 if Assigned(Node) then
  Result := Node.Text
end;

var
 Doc : IXMLDocument;

begin
 DefaultDOMVendor := sOmniXmlVendor;
 Doc := TXMLDocument.Create(nil);
 try
  Doc.LoadFromXML(Xml);
  Writeln(Doc.XML.Text);
  Writeln(XPathQuery(Doc, '//SomeResult'));
 except
  on E: Exception do
   Writeln(E.ClassName, ': ', E.Message);
 end;
 Doc := nil;
 Readln;
end.
whosrdaddy
  • @JerryDodge: I removed the COM initialization routine and unit as they were leftovers from testing and are not needed here. – whosrdaddy Jun 07 '15 at 17:59
  • OmniXML has VERY limited support for XPath. Even to have simple `X<>Y` condition I had to patch the library. Kluug's OXML site has a speed shootout of some different XML libs for Delphi, so might be a starting point to explore alternatives. – Arioch 'The Sep 26 '16 at 14:27
2

When I tried this a couple of years ago, I found that namespace lookup in XPath differed between XML providers.

If I remember correctly, MSXML lets you just use the namespace prefixes as they are defined in the XML file.

The ADOM 4 provider requires that you resolve namespace prefixes used in your XPath query to the actual namespaces, independent of the namespace mapping used in the xml file. There is a method pointer for that purpose, OnOx4XPathLookupNamespaceURI. Then you can have a name lookup function like this:

procedure TTestXmlUtil.EventLookupNamespaceURI(
  const AContextNode: IDomNode; const APrefix: WideString;
  var ANamespaceURI: WideString);
begin
  if APrefix = 'soap' then
    ANamespaceURI := 'http://schemas.xmlsoap.org/soap/envelope/'
  else if APrefix = 'some' then
    ANamespaceURI := 'http://someurl'
end;

Using this lookup function, and the selectNode function (which looks like something I may have once posted in a Delphi forum, taken from https://github.com/Midiar/adomxmldom/blob/master/xmldocxpath.pas), I could do the following test (using your xml in a string constant):

procedure TTestXmlUtil.SetUp;
begin
  inherited;
  DefaultDOMVendor := sAdom4XmlVendor;
  docFull := LoadXmlData(csSoapXml);

  OnOx4XPathLookupNamespaceURI := EventLookupNamespaceURI;
end;

procedure TTestXmlUtil.Test_selectNode;
var
  xn: IXmlNode;
begin
  xn := selectNode(docFull.DocumentElement, '/soap:Envelope/soap:Body/some:SomeResponse/some:SomeResult');
  CheckNotNull(xn, 'selectNode returned nil');
end;

I had to modify your XPath query a little for the default namespace.

Midiar
  • Looks promising, as if this is the appropriate solution. However Whosrdaddy's solution above was much easier and cleaner. – Jerry Dodge Jun 07 '15 at 16:56
1

As others have pointed out, different vendors handle namespaces differently. Here is an example using the MSXML DOMVendor (the Windows default). I do realise this is not exactly what the OP was asking, but I felt it was worth documenting.

XML:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <SomeResponse xmlns="http://someurl">
      <SomeResult>
        Some result here
      </SomeResult>
    </SomeResponse>
  </soap:Body>
</soap:Envelope>

Selection code (for completeness)

// From a post in Embarcadero's Delphi XML forum.
function selectNode(xnRoot: IXmlNode; const nodePath: WideString): IXmlNode;
var
  intfSelect : IDomNodeSelect;
  dnResult : IDomNode;
  intfDocAccess : IXmlDocumentAccess;
  doc: TXmlDocument;
begin
  Result := nil;
  if not Assigned(xnRoot) or not Supports(xnRoot.DOMNode, IDomNodeSelect, intfSelect) then
    Exit;
  dnResult := intfSelect.selectNode(nodePath);
  if Assigned(dnResult) then
  begin
    if Supports(xnRoot.OwnerDocument, IXmlDocumentAccess, intfDocAccess) then
      doc := intfDocAccess.DocumentObject
    else
      doc := nil;
    Result := TXmlNode.Create(dnResult, nil, doc);
  end;
end;

Actual setting of XML search namespaces:

uses
  Winapi.MSXMLIntf,  // NOTE: Use this version of the interface. MSXML2_TLB won't work.
  Xml.Win.msxmldom;  // declares TMSDOMDocument
...
procedure TForm1.DoExampleSearch;
var fnd:IXmlNode;
    doc:IXmlDomDocument2;
    msdoc:TMSDOMDocument;
const searchnames = 'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '+
                    'xmlns:xsd="http://www.w3.org/2001/XMLSchema" '+
                    'xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" '+
                    'xmlns:some="http://someurl"';

begin
  if Xmldocument1.DOMDocument is TMSDOMDocument then
  begin
    msdoc:=Xmldocument1.DOMDocument as TMSDOMDocument;
    doc:=(msdoc.MSDocument as IXMLDOMDocument2);
    doc.setProperty('SelectionLanguage', 'XPath');
    doc.setProperty('SelectionNamespaces',searchNames);
  end;
  fnd:=selectNode(XmlDocument1.DocumentElement,'/soap:Envelope/soap:Body/some:SomeResponse/some:SomeResult');
  if (fnd=nil) then showmessage('Not found') else showmessage('Found: '+fnd.Text);
end;

A couple of things worth noting: once you add namespaces into the mix at all, XPath seems to insist on them for everything. Note that I added a 'some' prefix for the search criteria, because SomeResult inherits the default namespace from its parent, and I have yet to get XPath to implicitly handle default namespaces.
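
If registering prefixes is not an option, one possible workaround is to match elements by local name only, which sidesteps the default-namespace problem. The following is a minimal sketch, not a tested solution: LocalNamePath is a hypothetical helper (not part of any library), and it assumes the provider's XPath engine supports the standard XPath 1.0 local-name() function. MSXML does when SelectionLanguage is set to 'XPath'; other providers may not.

```pascal
// Hypothetical helper: turns a list of element local names into an
// absolute XPath that ignores namespaces and prefixes entirely.
// Assumption: the XPath engine in use supports local-name().
function LocalNamePath(const Names: array of string): string;
var
  I: Integer;
begin
  Result := '';
  for I := Low(Names) to High(Names) do
    // Each step matches any element whose local name equals Names[I].
    Result := Result + '/*[local-name()=''' + Names[I] + ''']';
end;

// Usage with the selectNode function shown above:
//   fnd := selectNode(XmlDocument1.DocumentElement,
//     LocalNamePath(['Envelope', 'Body', 'SomeResponse', 'SomeResult']));
```

For the four names in the usage comment, the helper builds /*[local-name()='Envelope']/*[local-name()='Body']/*[local-name()='SomeResponse']/*[local-name()='SomeResult']. The trade-off is that elements with the same local name in different namespaces can no longer be told apart.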

Robbie Matthews
0

One solution could be to remove all namespaces before you start processing your XML:

class function TXMLHelper.RemoveNameSpaces(XMLString: String): String;
const
  // An XSLT script for removing the namespaces from any document.
  // From http://wiki.tei-c.org/index.php/Remove-Namespaces.xsl
  cRemoveNSTransform =
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">' +
    '<xsl:output method="xml" encoding="utf-8"/>' +

    '<xsl:template match="/|comment()|processing-instruction()">' +
    '    <xsl:copy>' +
    '      <xsl:apply-templates/>' +
    '    </xsl:copy>' +
    '</xsl:template>' +

    '<xsl:template match="*">' +
    '    <xsl:element name="{local-name()}">' +
    '      <xsl:apply-templates select="@*|node()"/>' +
    '    </xsl:element>' +
    '</xsl:template>' +

    '<xsl:template match="@*">' +
    '    <xsl:attribute name="{local-name()}">' +
    '      <xsl:value-of select="."/>' +
    '    </xsl:attribute>' +
    '</xsl:template>' +

    '</xsl:stylesheet>';

var
  Doc, XSL, Res: IXMLDocument;
  UTF8: UTF8String;
begin
   try
     Doc := LoadXMLData(XMLString);
     XSL := LoadXMLData(cRemoveNSTransform);
     Res := NewXMLDocument;
     Doc.Node.TransformNode(XSL.Node,Res);  // Param types IXMLNode, IXMLDocument
     Res.SaveToXML(Utf8);      // This ensures that the encoding remains utf-8
     Result := String(UTF8);
   except
     on E:Exception do Result := E.Message;
   end;
end; { RemoveNameSpaces }

(TXMLHelper is a helper class that I have with some useful XML handling functions)

Jan Doggen
  • Taking a break from my computer for a bit, will try soon, but just seems like a strange work-around. – Jerry Dodge Jun 06 '15 at 21:43
  • Yep, but in my case I did not need namespaces at all when processing the incoming XML. It even simplified parsing/debugging/logging because all that 'clutter' was gone. – Jan Doggen Jun 07 '15 at 10:01
0

The OmniXML solution:

I can absolutely confirm the OmniXML XPath does NOT support namespaces per se.

BUT:

since it treats the nodenames as literals, 'soap:Envelope' will work in a query PROVIDED the name in the xml document IS soap:Envelope. So in the OP example, the OmniXML search path '/soap:Envelope/soap:Body/SomeResponse/SomeResult' would work.

Note that you can absolutely NOT rely on inherited or default namespaces: OmniXML matches on the literal node name.

You could fairly easily implement a loop to either remove or normalize all namespace prefixes in your document without too much effort.
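
As a variation on that idea, here is a rough sketch that leaves the document untouched and instead ignores prefixes while walking the tree, so it does not depend on any provider's XPath support. LocalNameOf and FindByLocalPath are hypothetical names introduced for illustration; the code assumes Xml.XMLIntf is in the uses clause and only relies on IXMLNode.ChildNodes, NodeType and NodeName.

```pascal
// Assumes: uses Xml.XMLIntf;

// Hypothetical helper: 'soap:Body' -> 'Body', 'SomeResult' -> 'SomeResult'.
function LocalNameOf(const NodeName: string): string;
var
  P: Integer;
begin
  P := Pos(':', NodeName);
  if P > 0 then
    Result := Copy(NodeName, P + 1, MaxInt)
  else
    Result := NodeName;
end;

// Hypothetical helper: follows a path of child element names below Root,
// comparing local names only. Returns nil if any step is not found.
function FindByLocalPath(const Root: IXMLNode;
  const Names: array of string): IXMLNode;
var
  I, J: Integer;
  Child: IXMLNode;
  Found: Boolean;
begin
  Result := Root;
  if not Assigned(Result) then
    Exit;
  for I := Low(Names) to High(Names) do
  begin
    Found := False;
    for J := 0 to Result.ChildNodes.Count - 1 do
    begin
      Child := Result.ChildNodes[J];
      // Skip text and comment nodes; compare the name without its prefix.
      if (Child.NodeType = ntElement) and
         (LocalNameOf(Child.NodeName) = Names[I]) then
      begin
        Result := Child;
        Found := True;
        Break;
      end;
    end;
    if not Found then
    begin
      Result := nil;
      Exit;
    end;
  end;
end;

// Usage (Doc is an IXMLDocument loaded with the OP's XML):
//   Node := FindByLocalPath(Doc.DocumentElement,
//     ['Body', 'SomeResponse', 'SomeResult']);
```

The path starts below the node you pass in, so with DocumentElement as the root the first name is 'Body', not 'Envelope'. Only the first matching child is followed at each step, which is enough for a single-result SOAP response like the one in the question.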

Robbie Matthews