5

I'm trying to locate the value of <Link role="self"> in the following XML file using an XPath query:

<?xml version="1.0" encoding="utf-8"?>
<Response xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
    <Copyright>Copyright © 2011 Microsoft and its suppliers. All rights reserved. This API cannot be accessed and the content and any results may not be used, reproduced or transmitted in any manner without express written permission from Microsoft Corporation.</Copyright>
    <BrandLogoUri>http://spatial.virtualearth.net/Branding/logo_powered_by.png</BrandLogoUri>
    <StatusCode>201</StatusCode>
    <StatusDescription>Created</StatusDescription>
    <AuthenticationResultCode>ValidCredentials</AuthenticationResultCode>
    <TraceId>ID|02.00.82.2300|</TraceId>
    <ResourceSets>
        <ResourceSet>
            <EstimatedTotal>1</EstimatedTotal>
            <Resources>
                <DataflowJob>
                    <Id>ID</Id>
                    <Link role="self">https://spatial.virtualearth.net/REST/v1/dataflows/Geocode/ID</Link>
                    <Status>Pending</Status>
                    <CreatedDate>2011-03-30T08:03:09.3551157-07:00</CreatedDate>
                    <CompletedDate xsi:nil="true" />
                    <TotalEntityCount>0</TotalEntityCount>
                    <ProcessedEntityCount>0</ProcessedEntityCount>
                    <FailedEntityCount>0</FailedEntityCount>
                </DataflowJob>
            </Resources>
        </ResourceSet>
    </ResourceSets>
</Response>

I was shown an XPath query in a previous post, but I keep getting an unassigned iNode in the following code.

function TForm1.QueryXMLData(XMLFilename, XMLQuery: string): string;
var
  iNode : IDOMNode;
  Sel: IDOMNodeSelect;
begin
  try
    XMLDoc.Active := False;
    XMLDoc.FileName := XMLFilename;
    XMLDoc.Active := True;

    Sel := XMLDoc.DOMDocument as IDomNodeSelect;

    Result := '';
    iNode := Sel.selectNode('Link[@role = "self"]');
    if Assigned(iNode) then
      if (not VarisNull(iNode.NodeValue)) then
        Result := iNode.NodeValue;

    XMLDoc.Active := False;

  Except on E: Exception do
    begin
      MessageDlg(E.ClassName + ': ' + E.Message, mtError, [mbOK], 0);
      LogEvent(E.Message);
    end;
  end;
end;

What can I try to resolve this?

halfer
Pieter van Wyk
  • The problems are all related to the default namespace. Remove the `xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1"` and then this works just fine: `//Link[@role="self"][1]/node()`. I don't know why you've got `[3]` in your example, since there's only one `Link` node in the document. I don't like the solution of removing the default namespace declaration, though; it just seems so hacky... – Cosmin Prund Apr 01 '11 at 10:54
  • I have now removed the [3] from the query as that was clearly a mistake on my part. – Pieter van Wyk Apr 01 '11 at 11:25
  • possible duplicate of [Delphi/MSXML: XPath queries fail](http://stackoverflow.com/questions/1519416/delphi-msxml-xpath-queries-fail) –  Apr 01 '11 at 15:28

4 Answers

12

If you want to locate Link anywhere in the document, you’ll have to prefix it with //; like this:

iNode := Sel.selectNode('//Link[@role = "self"][3]');

This will start searching at the root of the document, and traverse the entire document, looking for a node called Link matching the specified criteria.

See here for more operators: http://msdn.microsoft.com/en-us/library/ms256122.aspx

Note that, as Runner suggests, you can also query the full XML path. This will be faster than the // operator, since it won’t have to blindly search every node.


Edit: Why are you requesting the third matching node (the [3] bit)? AFAICS, there’s only one; if your real document does have more, and you’re certain you want the third, then it’s OK. Otherwise, remove the [3] query.


Also, depending on the XML implementation you’re using (vendor and version), you might have to specify the XML namespace. In MSXML 4 through 6 (IIRC), you’d have to use

XMLDoc.setProperty('SelectionNamespaces', 'xmlns:ns="http://schemas.microsoft.com/search/local/ws/rest/v1"');

This would mean using that prefix in your queries as well:

iNode := Sel.selectNode('//ns:Link[@role = "self"][3]');
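For a self-contained check of the SelectionNamespaces approach, here is a minimal sketch rather than a drop-in fix. It assumes MSXML 6 is installed and talks to it through late binding (`CreateOleObject`) instead of `TXMLDocument`, so that `setProperty` is directly reachable. The program name, the constant names and the trimmed-down sample XML are made up for illustration, and the `[3]` predicate is left out because the sample holds a single Link element. Note that it reads the `text` property, since `nodeValue` is null for element nodes in MSXML. With MSXML 3 you would additionally have to set `SelectionLanguage` to `XPath`, because that version defaults to XSLPattern.

```pascal
program SelectWithNamespace;

{$APPTYPE CONSOLE}

uses
  SysUtils, Variants, ActiveX, ComObj;

const
  // Default namespace declared on the <Response> element in the question.
  BingNamespace = 'http://schemas.microsoft.com/search/local/ws/rest/v1';
  // Trimmed-down copy of the response from the question (illustrative only).
  SampleXml =
    '<Response xmlns="' + BingNamespace + '">' +
    '<ResourceSets><ResourceSet><Resources><DataflowJob>' +
    '<Link role="self">https://spatial.virtualearth.net/REST/v1/dataflows/Geocode/ID</Link>' +
    '</DataflowJob></Resources></ResourceSet></ResourceSets>' +
    '</Response>';

var
  Doc, Node: OleVariant;
  Loaded: Boolean;
  Reason, LinkText: string;
begin
  CoInitialize(nil);
  try
    try
      // Late-bound MSXML 6 document; this ProgID only resolves if MSXML 6 is installed.
      Doc := CreateOleObject('Msxml2.DOMDocument.6.0');
      Doc.async := False;
      Loaded := Doc.loadXML(SampleXml);
      if not Loaded then
      begin
        Reason := Doc.parseError.reason;
        raise Exception.Create('Parse error: ' + Reason);
      end;

      // Bind the prefix "ns" (any prefix will do) to the document's default namespace.
      Doc.setProperty('SelectionNamespaces', 'xmlns:ns="' + BingNamespace + '"');

      // The query has to use the prefix, even though the document itself has none.
      Node := Doc.selectSingleNode('//ns:Link[@role="self"]');
      if VarIsClear(Node) then
        Writeln('No matching Link element')
      else
      begin
        // Read the element's text content; nodeValue would be null here.
        LinkText := Node.text;
        Writeln(LinkText);
      end;
    except
      on E: Exception do
        Writeln(E.ClassName, ': ', E.Message);
    end;
  finally
    // Release the COM references before shutting COM down.
    Node := Unassigned;
    Doc := Unassigned;
    CoUninitialize;
  end;
end.
```

If the query still comes back empty, the first things to check are that the URI passed to SelectionNamespaces matches the one in the document character for character, and that the query uses the prefix on every element step.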
Martijn
  • Is there a way to determine the XML implementation? – Pieter van Wyk Apr 01 '11 at 10:48
  • IIRC, `TXMLDocument` has a Vendor property, or an XMLImplementation property which has a Vendor property. No Delphi here atm, so I can’t look it up until later. – Martijn Apr 01 '11 at 10:55
  • I can set it to `MSXML` or `ADOM XML v4`. (`TXMLDocument.DOMVendor`). – Pieter van Wyk Apr 01 '11 at 11:18
  • +1 for `setProperty` from me as well, but I'd really love to see a working example, because I can't get it to work (and tried). I imported the MSXML 3.0 type library so I can get my hands on `setProperty` but I can't seem to find the proper syntax. – Cosmin Prund Apr 01 '11 at 11:28
  • Haven’t got much time atm, perhaps this could help: http://stackoverflow.com/questions/263419/getting-started-with-xml-and-delphi – Martijn Apr 01 '11 at 11:51
  • It doesn't. It's the specific syntax required for `SelectionNamespaces` when you want to select the default namespace that's giving me trouble. Microsoft also managed to put quite a few examples on their website; unfortunately, none of them selects the *default* namespace! – Cosmin Prund Apr 01 '11 at 12:08
  • Ah, it’s the default namespace you’re looking for. Sorry, that’s a bug, er, feature in MSXML: you’ll _have_ to use an explicit prefix in your XPath queries, just like I’ve done in my answer. It doesn’t matter if it matches the prefix in the actual document; SelectionNamespaces matches the full namespace URL with the prefixes used in your XPath queries. – Martijn Apr 01 '11 at 12:27
  • I tried specifying the namespace using `XML2.setProperty('SelectionNamespaces', 'xmlns:abc="http://schemas.microsoft.com/search/local/ws/rest/v1"')` and then querying with `WriteLn(XML2.selectSingleNode('//abc:Link[1]').nodeValue)`. I even tried quoting the "http" namespace with pascal-style-ticks (since most samples on the web are from C-style languages, with normal quotation for the outer string and using ticks for the inner string constant). So far nothing worked, the XPath query doesn't return anything. And I'm not affiliated with OP, I'm just extra-curious. – Cosmin Prund Apr 01 '11 at 12:56
  • @CosminPrund: Is it giving you an access violation? If so, then I’m baffled; if not, try to request the `text` property instead of the `nodeValue` one. I seem to remember that `nodeValue` will only really work on attributes and text nodes, not on elements. – Martijn Apr 01 '11 at 13:16
  • AV, that is, selectSingleNode returns nil. – Cosmin Prund Apr 01 '11 at 20:34
6

You should write it like this:

iNode := Sel.selectNode('//Link[@role = "self"]');

which will get you the first Link node in the document with attribute role="self" (even if there is more than one).

Or you can go the absolute path:

iNode := Sel.selectNode('/Response/ResourceSets/ResourceSet/Resources/DataflowJob/Link[@role = "self"]');

or even something in between

iNode := Sel.selectNode('//Resources/DataflowJob/Link[@role = "self"]');
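Whether these paths match depends on how the parser treats the default namespace in the question's document. A quick way to find out is to run them against a trimmed-down copy of the response. The sketch below is only an illustration: it assumes MSXML 6 through late binding, and the program name, the `Report` helper and the sample XML are made up. The third query uses the XPath 1.0 `local-name()` function, an alternative that is not part of the answer above; it ignores namespaces altogether, which can be handy when no prefix can be registered.

```pascal
program CompareQueries;

{$APPTYPE CONSOLE}

uses
  SysUtils, Variants, ActiveX, ComObj;

const
  // Trimmed-down copy of the response from the question (illustrative only).
  SampleXml =
    '<Response xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">' +
    '<ResourceSets><ResourceSet><Resources><DataflowJob>' +
    '<Link role="self">https://spatial.virtualearth.net/REST/v1/dataflows/Geocode/ID</Link>' +
    '</DataflowJob></Resources></ResourceSet></ResourceSets>' +
    '</Response>';

// Hypothetical helper: runs one XPath query and reports whether it matched anything.
procedure Report(Doc: OleVariant; const Query: string);
var
  Node: OleVariant;
begin
  Node := Doc.selectSingleNode(Query);
  if VarIsClear(Node) then
    Writeln('no match: ', Query)
  else
    Writeln('match:    ', Query);
end;

var
  Doc: OleVariant;
  Loaded: Boolean;
begin
  CoInitialize(nil);
  try
    try
      // Late-bound MSXML 6 document; this ProgID only resolves if MSXML 6 is installed.
      Doc := CreateOleObject('Msxml2.DOMDocument.6.0');
      Doc.async := False;
      Loaded := Doc.loadXML(SampleXml);
      if not Loaded then
        raise Exception.Create('Could not parse the sample XML');

      // Unprefixed names only match elements that are in no namespace.
      Report(Doc, '//Link[@role="self"]');
      Report(Doc, '//Resources/DataflowJob/Link[@role="self"]');
      // Namespace-agnostic alternative: compare the local name only.
      Report(Doc, '//*[local-name()="Link" and @role="self"]');
    except
      on E: Exception do
        Writeln(E.ClassName, ': ', E.Message);
    end;
  finally
    // Release the COM reference before shutting COM down.
    Doc := Unassigned;
    CoUninitialize;
  end;
end.
```

With a namespace-aware parser such as MSXML, the first two queries should come back empty for this document, while a parser without namespace support may match all three.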
Runner
  • It’s true that a full XPath will have better performance, since it won’t have to search 'blindly' through the document. – Martijn Apr 01 '11 at 10:56
  • You're not handling the default namespace issue. – Cosmin Prund Apr 01 '11 at 10:56
  • @Cosmin Prund, true, I only presented the correct XPath syntax. As far as I can see, Martijn already has an answer that handles the namespace correctly. – Runner Apr 01 '11 at 18:20
  • Runner, even the OP got the proper syntax in the question! The namespace issue is at the root of the problem. Remove the namespace declaration from the sample XML and the OP's code works. – Cosmin Prund Apr 01 '11 at 20:37
  • No he did not. iNode := Sel.selectNode('Link[@role = "self"]'); does not select the correct node. It searches for the Link node as the child of the root node. – Runner Apr 02 '11 at 08:17
  • Furthermore, it depends on what XML parser he is using. For instance, in OmniXML my answer works, as OmniXML does not support namespaces and the XPath would select the correct node. But I still agree, of course, that the namespace is an issue here. As I already said, Martijn already wrote that and I don't intend to repeat it. – Runner Apr 02 '11 at 08:19
2

In the end I used OmniXML with the following code.

uses
  OmniXML, OmniXMLUtils, OmniXMLXPath;

...

function GetResultsURL(Filename: string): string;
var
  FXMLDocument: IXMLDocument;
  XMLElementList: IXMLNodeList;
  XMLNode: IXMLNode;
  XMLElement: IXMLElement;
  i: integer;
begin
  // Stays empty when no matching Link element is found
  Result := '';

  // Create and load the XML document
  FXMLDocument := CreateXMLDoc;
  FXMLDocument.Load(Filename);

  // We are looking for: <Link role="output" name="failed">
  XMLElementList := FXMLDocument.GetElementsByTagName('Link');
  for i := 0 to Pred(XMLElementList.Length) do
  begin
    // Check each node and element
    XMLNode := XMLElementList.Item[i];
    XMLElement := XMLNode as IXMLElement;
    if XMLElement.GetAttribute('role') = 'output' then
      if Pos('failed', XMLNode.Text) > 0 then
        Result := XMLNode.Text;
  end;
end;

The XML received looks like this ...

...

<DataflowJob>
  <Id>12345</Id>
  <Link role="self">https://spatial.virtualearth.net/REST/v1/dataflows/Geocode/12345</Link>
  <Link role="output" name="failed">https://spatial.virtualearth.net/REST/v1/dataflows/Geocode/12345/output/failed</Link>
  <Status>Completed</Status>
  <CreatedDate>2011-04-04T03:57:49.0534147-07:00</CreatedDate>
  <CompletedDate>2011-04-04T03:58:43.709725-07:00</CompletedDate>
  <TotalEntityCount>1</TotalEntityCount>
  <ProcessedEntityCount>1</ProcessedEntityCount>
  <FailedEntityCount>1</FailedEntityCount>
</DataflowJob>

...
Pieter van Wyk
1

Martijn mentioned the Vendor property in a comment on his answer.

The property is in fact called DOMVendor.

Further below is some sample code that shows how that works.
The sample code depends on some helper classes you can find on bo.codeplex.com.

Note that DOMVendor will not tell you what version of MSXML you have, but you can ask it if it has XPath support.

Old MSXML versions (that are still in the field, for instance on plain vanilla Windows 2003 Server installations) won't support XPath, but support XSLPattern.
They will happily execute your queries, but sometimes return different results, or barf.

There are some subtle bugs in various sub-versions of MSXML6 too.
You need 6.30., 6.20.1103., 6.20.2003.0 or higher. 6.3 is only available on Windows 7/Windows 2008 Server. The 6.20 flavours are for Windows XP and Windows 2003 Server.
Finding out which versions actually work took me quite some time :-)

This shows you the installed MSXML, in my case msxml6.dll: 6.20.1103.0:

procedure TMainForm.ShowMsxml6VersionClick(Sender: TObject);
begin
{
Windows 2003 with MSXML 3: msxml3.dll: 8.100.1050.0

windows XP with MSXML 4: msxml4.dll: 4.20.9818.0

Windows XP with MSXML 6 SP1: msxml6.dll: 6.10.1129.0

windows XP with MSXML 6 SP2 (latest):
------------------------
msxml6.dll: 6.20.1103.0

Windows 7 with MSXML 6 SP3:
--------------------------
msxml6.dll: 6.30.7600.16385
}
  try
    Logger.Log(TmsxmlFactory.msxmlBestFileVersion.ToString());
    TmsxmlFactory.AssertCompatibleMsxml6Version();
  except
    on E: Exception do
    begin
      Logger.Log('Error');
      Logger.Log(E);
    end;
  end;
end;

This shows the DOMVendor code. It makes some use of the helper classes mentioned above:

procedure TMainForm.FillDomVendorComboBox;
var
  DomVendorComboBoxItemsCount: Integer;
  Index: Integer;
  CurrentDomVendor: TDOMVendor;
  DefaultDomVendorIndex: Integer;
  CurrentDomVendorDescription: string;
const
  NoSelection = -1;
begin
  DomVendorComboBox.Clear;
  DefaultDomVendorIndex := NoSelection;
  for Index := 0 to DOMVendors.Count - 1 do
  begin
    CurrentDomVendor := DOMVendors.Vendors[Index];
    LogDomVendor(CurrentDomVendor);
    CurrentDomVendorDescription := CurrentDomVendor.Description;
    DomVendorComboBox.Items.Add(CurrentDomVendorDescription);
    if DefaultDOMVendor = CurrentDomVendorDescription then
      DefaultDomVendorIndex := DomVendorComboBox.Items.Count - 1;
  end;
  DomVendorComboBoxItemsCount := DomVendorComboBox.Items.Count;
  if (DefaultDomVendorIndex = NoSelection) then
  begin
    if DefaultDOMVendor = NullAsStringValue then
    begin
      if DomVendorComboBoxItemsCount > 0 then
        DefaultDomVendorIndex := 0;
    end
    else
      DefaultDomVendorIndex := DomVendorComboBoxItemsCount - 1;
  end;
  DomVendorComboBox.ItemIndex := DefaultDomVendorIndex;
end;

procedure TMainForm.LogDomVendor(const CurrentDomVendor: TDOMVendor);
var
  CurrentDomVendorDescription: string;
  DocumentElement: IDOMElement;
  DomDocument: IDOMDocument; // xmldom.IDOMDocument is the plain XML DOM
  XmlDocument: IXMLDocument; // XMLIntf.IXMLDocument is the enriched XML interface to the TComponent wrapper, which has a DOMDocument: IDOMDocument property, and allows obtaining XML from different sources (text, file, stream, etc)
  XmlDocumentInstance: TXMLDocument; // unit XMLDoc

  DOMNodeEx: IDOMNodeEx;
  XMLDOMDocument2: IXMLDOMDocument2;
begin
  CurrentDomVendorDescription := CurrentDomVendor.Description;
  Logger.Log('DOMVendor', CurrentDomVendorDescription);

  XmlDocumentInstance := TXMLDocument.Create(nil);
  XmlDocumentInstance.DOMVendor := CurrentDomVendor;
  XmlDocument := XmlDocumentInstance;

  DomDocument := CurrentDomVendor.DOMImplementation.createDocument(NullAsStringValue, NullAsStringValue, nil);

  XmlDocument.DOMDocument := DomDocument;
  XmlDocument.LoadFromXML('<document/>');
  DomDocument := XmlDocument.DOMDocument; // we get another reference here, since we loaded some XML now

  DocumentElement := DomDocument.DocumentElement;
  if Assigned(DocumentElement) then
  begin
    DOMNodeEx := DocumentElement as IDOMNodeEx;
    Logger.Log(DOMNodeEx.xml);
  end;

  if IDomNodeHelper.GetXmlDomDocument2(DomDocument, XMLDOMDocument2) then
  begin
    // XSLPattern versus XPath
    // see https://stackoverflow.com/questions/784745/accessing-comments-in-xml-using-xpath
    // XSLPattern is 0 based, but XPath is 1 based.
    Logger.Log(IDomNodeHelper.SelectionLanguage, string(XMLDOMDocument2.getProperty(IDomNodeHelper.SelectionLanguage)));
    Logger.Log(IDomNodeHelper.SelectionNamespaces, string(XMLDOMDocument2.getProperty(IDomNodeHelper.SelectionNamespaces)));
  end;


  LogDomVendorFeatures(CurrentDomVendor,
    ['', '1.0', '2.0', '3.0'],
    // http://www.w3.org/TR/DOM-Level-3-Core/introduction.html#ID-Conformance
    // http://reference.sitepoint.com/javascript/DOMImplementation/hasFeature
    ['Core',
     'XML',
     'Events',
     'UIEvents',
     'MouseEvents',
     'TextEvents',
     'KeyboardEvents',
     'MutationEvents',
     'MutationNameEvents',
     'HTMLEvents',
     'LS',
     'LS-Async',
     'Validation',
     'XPath']);
end;


procedure TMainForm.LogDomVendorFeatures(const CurrentDomVendor: TDOMVendor; const Versions, Features: array of string);
var
  AllVersions: string;
  Feature: string;
  Line: string;
  Supported: Boolean;
  SupportedAll: Boolean;
  SupportedNone: Boolean;
  SupportedVersions: IStringListWrapper;
  Version: string;
begin
  SupportedVersions := TStringListWrapper.Create();
  for Version in Versions do
    AddSupportedVersion(Version, SupportedVersions);
  AllVersions := Format('All: %s', [SupportedVersions.CommaText]);
  for Feature in Features do
  begin
    SupportedAll := True;
    SupportedNone := True;
    SupportedVersions.Clear();
    for Version in Versions do
    begin
      Supported := CurrentDomVendor.DOMImplementation.hasFeature(Feature, Version);
      if Supported then
        AddSupportedVersion(Version, SupportedVersions);
      SupportedAll := SupportedAll and Supported;
      SupportedNone := SupportedNone and not Supported;
    end;
    if SupportedNone then
      Line := Format('None', [])
    else
    if SupportedAll then
      Line := Format('%s', [AllVersions])
    else
      Line := Format('%s', [SupportedVersions.CommaText]);
    Logger.Log('  ' + Feature, Line);
  end;
end;

Delphi XE will show these:

DOMVendor:MSXML
<document/>
SelectionLanguage:XPath
SelectionNamespaces:
  Core:None
  XML:Any,1.0
  Events:None
  UIEvents:None
  MouseEvents:None
  TextEvents:None
  KeyboardEvents:None
  MutationEvents:None
  MutationNameEvents:None
  HTMLEvents:None
  LS:None
  LS-Async:None
  Validation:None
  XPath:Any,1.0
DOMVendor:ADOM XML v4
?<document></document>

  Core:None
  XML:None
  Events:None
  UIEvents:None
  MouseEvents:None
  TextEvents:None
  KeyboardEvents:None
  MutationEvents:None
  MutationNameEvents:None
  HTMLEvents:None
  LS:None
  LS-Async:None
  Validation:None
  XPath:None
Community
Jeroen Wiert Pluimers