I have this code sample (full code here):

$raw = " ( xml data ) ";

$nsURIs = array
(
  'opf'     => 'http://www.idpf.org/2007/opf',
  'dc'      => 'http://purl.org/dc/elements/1.1/',
  'dcterms' => 'http://purl.org/dc/terms/',
  'xsi'     => 'http://www.w3.org/2001/XMLSchema-instance',
  'ncx'     => 'http://www.daisy.org/z3986/2005/ncx/',
  'calibre' => 'http://calibre.kovidgoyal.net/2009/metadata'
);

$dom   = new \DOMDocument();
$dom->loadXML( $raw,LIBXML_NOBLANKS );

$metadata = $dom->getElementsByTagName( 'metadata' )->item(0);         #metadata

$xpath = new \DOMXPath( $dom );
foreach( $nsURIs as $key => $ns ) $xpath->registerNamespace( $key, $ns );

$query = array();
$query[] = '//dc:identifier';                                          #00
$query[] = '//dc:identifier[. = "9780439554930"]';                     #01
$query[] = '//dc:identifier[@opf:scheme="ISBN"]';                      #02
$query[] = '//dc:identifier[starts-with(@opf:scheme,"I")]';            #03
$query[] = '//dc:identifier[contains(@opf:scheme,"SB")]';              #04
$query[] = '//dc:identifier[ends-with(@opf:scheme,"N")]';              #05 Unregistered
$query[] = '//dc:identifier["IS" = substring(@opf:scheme, 0, 2)]';     #06 Fails
$query[] = '//dc:identifier[contains(.,"439")]';                       #07
$query[] = '//dc:identifier[@*="ISBN"]';                               #08
$query[] = '//dc:date[contains(@*, "cation")]';                        #09
$query[] = '//dc:*[contains(@opf:*, "and")]';                          #10 Wrong Result
$query[] = '//dc:*[contains(@opf:file-as, "and")]';                    #11
$query[] = '//dc:*[contains(@opf:*, "ill")]';                          #12
$query[] = '//dc:contributor[@opf:role and @opf:file-as]';             #13
$query[] = '//dc:subject[contains(.,"anta") and contains(.,"Urban")]'; #14
$query[] = '//dc:subject[text() = "Fantasy"]';                         #15

for( $i=0; $i<count($query); $i++ )
{
  $result = $xpath->evaluate( $query[$i] );
  echo sprintf( "[%02d]  % 2d  %s\n", $i, $result->length, $query[$i] );
}

Query #05 fails because ends-with is an unregistered function; query #06 fails (0 results instead of 1); and query #10 returns 1 item instead of 2 (the following query, #11, correctly returns 2). I get the same results when running the queries with $metadata as the context node.

In this question I found an alternative to the unregistered ends-with:

$query[] = '//dc:identifier["N" = substring(@opf:scheme, string-length(@opf:scheme) - string-length("N"))]';

but even this hack fails...

Does anyone have a suggestion or an alternative?

fusion3k
  • What is the question exactly? What are you trying to achieve? – har07 Jan 19 '16 at 03:46
  • My goal is to check whether XPath works correctly. The question is: are the failures in the provided code XPath bugs, or are they caused by my own errors? In the XML there are two nodes with the text 'and' in the attribute named `opf:file-as`: why does query #10 return 1 item (incorrect) while query #11 returns 2 items (correct)? And is the hack to work around the unregistered `ends-with` function (marked as working in the cited question) incorrect? Or do I have to give up on `ends-with` queries? I'm sorry if I was not clear. – fusion3k Jan 20 '16 at 00:58
  • regarding `ends-with()` function, you should've read [this](http://stackoverflow.com/questions/402211/how-to-use-xpath-function-in-a-xpathexpression-instance-programatically/402357#402357) (answer to the duplicated question you linked to) – har07 Jan 20 '16 at 01:39
  • Yes, I had read it (and linked it too)... but I forgot the `+ 1` at the end! Thanks! (The problem with query **#10** remains unsolved). – fusion3k Jan 20 '16 at 01:51
  • Side notes: 1. XPath expressions that don't pose a problem only distract the reader; I'd suggest removing them from your question. 2. Multiple questions are normally better posted separately: http://meta.stackexchange.com/questions/39223/one-post-with-multiple-questions-or-multiple-posts (consider these in your future questions). – har07 Jan 20 '16 at 02:08
  • Ok, Thanks for suggestions – fusion3k Jan 20 '16 at 02:20
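
For readers following the `+ 1` fix mentioned in the comments, here is a minimal sketch of the ends-with emulation in XPath 1.0. The two-element document and the names `$suffix` and `$endsWithCount` are made up for illustration; only the namespaces and the `opf:scheme` attribute come from the question.

```php
<?php
// Hypothetical stand-in for the question's XML: two identifiers with different schemes.
$xml = '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">'
     . '<dc:identifier opf:scheme="ISBN">9780439554930</dc:identifier>'
     . '<dc:identifier opf:scheme="uuid">abc</dc:identifier>'
     . '</metadata>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');
$xpath->registerNamespace('opf', 'http://www.idpf.org/2007/opf');

// XPath 1.0 has no ends-with(), so compare the suffix with the tail of the value.
// substring() positions are 1-based, hence the "+ 1" after the length difference.
$suffix = 'N';
$query  = sprintf(
    '//dc:identifier[substring(@opf:scheme, string-length(@opf:scheme) - string-length("%1$s") + 1) = "%1$s"]',
    $suffix
);

$endsWithCount = $xpath->evaluate($query)->length;
echo $endsWithCount, "\n"; // only the "ISBN" identifier ends with "N"
```

Without the `+ 1`, the tail taken from "ISBN" would be "BN" rather than "N", which is why the version quoted in the question matches nothing.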

1 Answer


Regarding #06 (Fails):

//dc:identifier["IS" = substring(@opf:scheme, 0, 2)]

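Explanation:

XPath string positions start at 1 rather than 0. With a start of 0 and a length of 2, substring() returns only the first character ("I"), so the comparison with "IS" is false. The correct arguments to substring() here are: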

//dc:identifier["IS" = substring(@opf:scheme, 1, 2)]
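
To see the off-by-one effect in isolation, the following sketch evaluates both substring() calls on a literal string. The `<root/>` document is only a placeholder so that DOMXPath has something to attach to, and the variable names are illustrative.

```php
<?php
// Any well-formed document will do; the expressions below use string literals only.
$dom = new DOMDocument();
$dom->loadXML('<root/>');
$xpath = new DOMXPath($dom);

// Start position 0 covers positions 0 and 1, but only position 1 exists.
$fromZero = $xpath->evaluate('substring("ISBN", 0, 2)');

// Start position 1 covers positions 1 and 2, i.e. the first two characters.
$fromOne = $xpath->evaluate('substring("ISBN", 1, 2)');

var_dump($fromZero, $fromOne);
```

Here `$fromZero` is "I" and `$fromOne` is "IS", which matches the 0-result behaviour of query #06.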

Regarding #10 (Wrong Result):

//dc:*[contains(@opf:*, "and")]

Explanation:

In XPath 1.0, when a node-set is passed to a function that expects a string, it is converted to the string-value of its first node in document order; the remaining nodes are ignored. So in this case only the first attribute with the opf prefix is tested, and the following element is not matched because its first opf attribute is opf:role="ill":

<dc:contributor opf:role="ill" opf:file-as="GrandPré, Mary">Mary GrandPré</dc:contributor>

To avoid this problem, move the test into a predicate on the attributes, so that each attribute is checked individually:

//dc:*[@opf:*[contains(., "and")]]
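
The difference between the two forms can be reproduced with a small sketch. The `dc:creator` element and its "Brand, Example" value are invented for illustration (the question's full XML is not shown here); the `dc:contributor` element is the one quoted above. The result relies on the attributes being visited in source order, which is what the question's output suggests for PHP's DOM extension.

```php
<?php
// Hypothetical two-element document; both opf:file-as values contain "and".
$xml = '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">'
     . '<dc:creator opf:file-as="Brand, Example" opf:role="aut">Example Brand</dc:creator>'
     . '<dc:contributor opf:role="ill" opf:file-as="GrandPré, Mary">Mary GrandPré</dc:contributor>'
     . '</metadata>';

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');
$xpath->registerNamespace('opf', 'http://www.idpf.org/2007/opf');

// contains() converts the node-set @opf:* to the string-value of its first node only.
$firstOnly = $xpath->evaluate('//dc:*[contains(@opf:*, "and")]')->length;

// The inner predicate is applied to every opf attribute in turn.
$anyAttr = $xpath->evaluate('//dc:*[@opf:*[contains(., "and")]]')->length;

echo $firstOnly, ' ', $anyAttr, "\n";
```

With this input the first query matches only `dc:creator` (1 element), while the second matches both elements (2).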
har07