I have an XML file that contains data of the following type:

<definition name="/products/phone" path="/main/something.jsp" > </definition>

There are dozens of such nodes in the XML file.

What I want to do is extract the path stored in the 'name' attribute and turn it into a URL, so my end result will be:

http://www.mysite.com/products/phone.jsp

Can I do this with a so-called XML parser? I have no idea where to begin. Can someone point me in the right direction? What tools do I need to achieve something like this?

I am particularly interested in doing this with PHP.

Obi-Wan

2 Answers

Given XML this simple, it should be easy to build the URL by joining a base URL, the value of the "name" attribute, and the expected file extension.

If you are comfortable with C#, and you know there is one and only one "definition" element, here is a small self-contained program that does what you require (it assumes you are loading the XML from a string):

using System;
using System.Xml;

public class parseXml
{
    // No trailing slash: the "name" attribute already starts with "/".
    private const string myDomain = "http://www.mysite.com";
    private const string myExtension = ".jsp";

    public static void Main()
    {
        string xmlString = "<definition name='/products/phone' path='/main/something.jsp'> </definition>";

        XmlDocument doc = new XmlDocument();

        doc.LoadXml(xmlString);

        // Use .Value to get the attribute text; ToString() would return the type name instead.
        string fqdn =   myDomain +
                        doc.DocumentElement.SelectSingleNode("//definition").Attributes["name"].Value +
                        myExtension;

        Console.WriteLine("Original XML: {0}\nResultant FQDN: {1}", xmlString, fqdn);
    }
}

Be careful with SelectSingleNode above: it returns only the first node that matches, so it is appropriate only when there is a single "definition" element. The XPath expression "//definition" matches "definition" elements at any depth in the document, regardless of the node you search from.

Fundamentally, it's worthwhile to read a primer on XML. XML is not difficult; it's a self-describing hierarchical data format: lots of nested text, angle brackets, and quotation marks :).

A good primer is probably the one at W3Schools: http://www.w3schools.com/xml/xml_whatis.asp

You may also want to read up on streaming (SAX/XmlReader) vs. loading (DOM/XmlDocument) XML: What is the difference between SAX and DOM?
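
Since the question asks about PHP, here is a minimal sketch of the streaming approach using PHP's XMLReader, which walks the document node by node instead of loading it all into memory. The `<root>` wrapper, the sample data, and the helper name `readDefinitionNames` are assumptions made for illustration only.

```php
<?php
// Sample input; the <root> wrapper element is an assumption.
$xml = <<<DATA
<?xml version="1.0"?>
<root>
<definition name="/products/phone" path="/main/something.jsp"> </definition>
<definition name="/products/cell" path="/main/something.jsp"> </definition>
</root>
DATA;

// Hypothetical helper: collect the "name" attribute of every <definition>.
function readDefinitionNames(string $xml): array
{
    $names = [];
    $reader = new XMLReader();
    $reader->XML($xml); // for a file on disk, $reader->open($path) could be used instead

    // read() advances one node at a time and returns false at the end.
    while ($reader->read()) {
        // Only opening <definition> tags are of interest.
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'definition') {
            $name = $reader->getAttribute('name'); // null when the attribute is missing
            if ($name !== null && $name !== '') {
                $names[] = $name;
            }
        }
    }
    $reader->close();

    return $names;
}

print_r(readDefinitionNames($xml));
```

For a file with only dozens of nodes, the DOM approach is usually simpler; streaming mostly pays off when the document is too large to hold in memory comfortably.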

I can provide a Java example too, if you feel that would be helpful.

AFKAP
  • This is very helpful for me to understand the logic. Much appreciated! Can you provide an example in PHP? The xml document contains dozens of "definition" nodes so I guess a for loop would be needed yes? Something like: `definition as $definition) { echo ..... } ?>` – Obi-Wan Aug 30 '13 at 14:06
  • Unfortunately I don't know PHP :(, but the logic will work like this: (1) Read the XML string/file into an XmlDocument object. (2) If you want a collection of all "definition" elements beneath the XML root node, the above XPath expression will still work ("//definition"), so apply that XPath expression to the XML document to return a collection of "definition" elements. (3) Once you have the collection of "definition" elements, iterate through them using a foreach loop (as you do above) and construct your resultant FQDNs. Does this help? – AFKAP Aug 30 '13 at 23:13

Not sure if you solved your problem, so here is a PHP solution:

$xml = <<<DATA
<?xml version="1.0"?>
<root>
<definition name="/products/phone" path="/main/something.jsp"> </definition>
<definition name="/products/cell" path="/main/something.jsp"> </definition>
<definition name="/products/mobile" path="/main/something.jsp"> </definition>
</root>
DATA;

$arr = array();
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadXML($xml); // parse as XML; loadHTML() would treat the input as an HTML page

$xpath = new DOMXPath($dom);
$defs = $xpath->query('//definition');

foreach($defs as $def) { 
   $attr = $def->getAttribute('name');
   if ($attr != "") {
      array_push($arr, $attr);
   }
}
print_r($arr);

See IDEONE demo

Result:

Array
(
    [0] => /products/phone
    [1] => /products/cell
    [2] => /products/mobile
)
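
To get from these names to the URLs the question asks for, one possible follow-up step is sketched below. The base URL and the `.jsp` extension are taken from the question's expected result; the constant names and the `buildUrls` helper are hypothetical and not part of the answer's original code.

```php
<?php
// Assumed values, taken from the expected result in the question.
const BASE_URL  = 'http://www.mysite.com';
const EXTENSION = '.jsp';

// Hypothetical helper: turn names such as "/products/phone" into full URLs.
function buildUrls(array $names, string $base = BASE_URL, string $ext = EXTENSION): array
{
    return array_map(
        function (string $name) use ($base, $ext): string {
            // Trim slashes on both sides so the parts join with exactly one "/".
            return rtrim($base, '/') . '/' . ltrim($name, '/') . $ext;
        },
        $names
    );
}

// Same values as the $arr printed above.
$arr = ['/products/phone', '/products/cell', '/products/mobile'];
print_r(buildUrls($arr));
```

With the sample data, the first entry becomes http://www.mysite.com/products/phone.jsp, which matches the result requested in the question.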
Wiktor Stribiżew