My XML file looks like this:

<log>
  <entry entry_id="E200911115777">
    <entry_data>
      <entry_title>Lorem ipsum dolor</entry_title>
      <entry_date>1999-04-15</entry_date>
    </entry_data>
  </entry>
  <entry entry_id="E205011115999">
    <entry_data>
      <entry_title>Lorem ipsum dolor</entry_title>
      <entry_date>2004-12-15</entry_date>
    </entry_data>
  </entry>
  <entry entry_id="E199912119116">
    <entry_data>
      <entry_title>Lorem ipsum dolor</entry_title>
      <entry_date>1990-11-20</entry_date>
    </entry_data>
  </entry>
</log>

I'm looking for code that will return the highest value of the entry_date tag, in this case, 2004-12-15. I'm using SimpleXML but I'm open to other solutions of course. Cheers.

Dimitre Novatchev
Kerans
  • Walk through the elements using the `getElementsByTagName` method and keep track of the highest value until the loop is finished. I don't think there is any simpler way – Pekka Jan 18 '11 at 13:04
  • @Pekka I am pretty sure you can do that with XPath and save on looping. The big question is if it's possible with XPath 1.0. I'm slapping an XPath tag onto the question. Maybe Dimitre wants to share something. – Gordon Jan 18 '11 at 13:09
  • I'd prefer XMLReader over SimpleXML for this, especially if it's a log file... – Your Common Sense Jan 18 '11 at 13:15
  • XPath 1.0 has no date comparison facilities, so you'd have to do something like this: http://stackoverflow.com/questions/3786443/xpath-to-get-the-element-with-the-highest-id but also using translate: http://stackoverflow.com/questions/4347320/xpath-dates-comparison – James Walford Jan 18 '11 at 13:28
  • However, if it's really a log file, just `tail` would be enough. – Your Common Sense Jan 18 '11 at 13:30
  • @Gordon - Dimitre has already shared some of this stuff, see comment above :-) Any particular reason you specify XPath 1.0 rather than 2.0 though? Date handling is one of the major improvements in 2.0 – James Walford Jan 18 '11 at 13:32
  • @James thanks for finding these. All the PHP5 XML extensions use libxml which does not support XPath 2.0 – Gordon Jan 18 '11 at 13:35
  • @Gordon - that's a fair limitation then :-) I wasn't aware of it, thanks. – James Walford Jan 18 '11 at 13:38
  • Good question, +1. See my answer for two simple (and one of them efficient) XSLT 1.0 solutions. :) – Dimitre Novatchev Jan 18 '11 at 14:35
  • Possible duplicate of [How to to sort a XML feed with SimpleXML ](http://stackoverflow.com/questions/1798005/how-to-to-sort-a-xml-feed-with-simplexml) and [Sorting an array of SimpleXML objects](http://stackoverflow.com/questions/2119686/sorting-an-array-of-simplexml-objects) –  Jan 18 '11 at 17:05

3 Answers

I. Here is a simple XSLT 1.0 solution that comes closest to using a single XPath expression (a single XPath 1.0 expression on its own cannot select the wanted node(s)):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="entry">
  <xsl:copy-of select=
   "self::node()
      [not((preceding-sibling::entry | following-sibling::entry)
             [translate(*/entry_date,'-','')
             >
             translate(current()/*/entry_date,'-','')
             ]
           )
      ]
   "/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<log>
    <entry entry_id="E200911115777">
        <entry_data>
            <entry_title>Lorem ipsum dolor</entry_title>
            <entry_date>1999-04-15</entry_date>
        </entry_data>
    </entry>
    <entry entry_id="E205011115999">
        <entry_data>
            <entry_title>Lorem ipsum dolor</entry_title>
            <entry_date>2004-12-15</entry_date>
        </entry_data>
    </entry>
    <entry entry_id="E199912119116">
        <entry_data>
            <entry_title>Lorem ipsum dolor</entry_title>
            <entry_date>1990-11-20</entry_date>
        </entry_data>
    </entry>
</log>

the wanted, correct result is produced:

<entry entry_id="E205011115999">
   <entry_data>
      <entry_title>Lorem ipsum dolor</entry_title>
      <entry_date>2004-12-15</entry_date>
   </entry_data>
</entry>

II. A more efficient XSLT 1.0 solution:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/*">
  <xsl:apply-templates>
   <xsl:sort select="entry_data/entry_date" order="descending"/>
  </xsl:apply-templates>
 </xsl:template>

 <xsl:template match="entry">
  <xsl:if test="position() = 1">
   <xsl:copy-of select="."/>
  </xsl:if>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the same XML document (above), the wanted, correct result is again produced:

<entry entry_id="E205011115999">
   <entry_data>
      <entry_title>Lorem ipsum dolor</entry_title>
      <entry_date>2004-12-15</entry_date>
   </entry_data>
</entry>
Dimitre Novatchev

Yeah, this should be quite easy with XPath. That is definitely the way to go, and SimpleXML works well with XPath in PHP.

Check out the docs here: http://www.php.net/manual/en/simplexmlelement.xpath.php

$xml = new SimpleXMLElement($string);

/* Search for <log><entry><entry_data><entry_date> */
$result = $xml->xpath('/log/entry/entry_data/entry_date');

// Print each entry date as a Unix timestamp.
foreach ($result as $node) {
    $timestamp = strtotime((string) $node);
    echo '/log/entry/entry_data/entry_date: ' . $timestamp . "\n";
}

I didn't actually test that code, but it should be pretty close to what you need. Timestamps have their limits, of course, but they seem fine for your use.
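
To get from the per-node timestamps in the loop above to the single latest date, one option is to keep a running maximum. What follows is only a minimal sketch: the helper name `latestEntryDate` and the variable `$xmlString` are hypothetical, and it assumes every `entry_date` holds a value that `strtotime()` can parse.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the latest
// entry_date formatted as 'Y-m-d', or null if no date could be parsed.
function latestEntryDate(string $xmlString): ?string
{
    $xml = new SimpleXMLElement($xmlString);
    $nodes = $xml->xpath('/log/entry/entry_data/entry_date');

    $latest = null;
    foreach ($nodes ?: [] as $node) {
        $timestamp = strtotime((string) $node);
        if ($timestamp === false) {
            continue; // skip values strtotime() cannot parse
        }
        if ($latest === null || $timestamp > $latest) {
            $latest = $timestamp; // new running maximum
        }
    }

    return $latest === null ? null : date('Y-m-d', $latest);
}

// Sample data taken from the question, reduced to the relevant elements.
$xmlString = <<<XML
<log>
  <entry entry_id="E200911115777"><entry_data><entry_date>1999-04-15</entry_date></entry_data></entry>
  <entry entry_id="E205011115999"><entry_data><entry_date>2004-12-15</entry_date></entry_data></entry>
  <entry entry_id="E199912119116"><entry_data><entry_date>1990-11-20</entry_date></entry_data></entry>
</log>
XML;

echo latestEntryDate($xmlString), "\n"; // prints 2004-12-15
```

Formatting the result with `date()` in the same default timezone that `strtotime()` used keeps the round trip consistent.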

Tom Gruner
$result = $xml->xpath('//entry_date');

usort($result,'strcmp');

$maxdate = end($result);
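
Sorting with `strcmp` works here because zero-padded `YYYY-MM-DD` dates order the same way as plain strings, so no date parsing is needed. If only the maximum is wanted, a sketch along the following lines skips the sort; the variable names are illustrative, and it assumes every date really is in zero-padded `YYYY-MM-DD` form.

```php
<?php
// Sample data taken from the question, reduced to the relevant elements.
$xmlString = <<<XML
<log>
  <entry entry_id="E200911115777"><entry_data><entry_date>1999-04-15</entry_date></entry_data></entry>
  <entry entry_id="E205011115999"><entry_data><entry_date>2004-12-15</entry_date></entry_data></entry>
  <entry entry_id="E199912119116"><entry_data><entry_date>1990-11-20</entry_date></entry_data></entry>
</log>
XML;

$xml = new SimpleXMLElement($xmlString);

// Cast each SimpleXMLElement to a plain string so max() compares text.
$dates = array_map('strval', $xml->xpath('//entry_date') ?: []);

// max() on non-numeric strings compares them lexicographically,
// which matches chronological order for YYYY-MM-DD values.
$maxDate = $dates ? max($dates) : null;

echo $maxDate, "\n"; // prints 2004-12-15
```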
user570783
  • Fantastic! This solved the issue perfectly! Thanks to everyone for their contribution. Sorry for this late comment, I've been very busy. – Kerans Feb 05 '11 at 23:51