I am trying to parse an XHTML file on iOS using TouchXML, which I am using for the first time. My XHTML has the following format:

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en">
  <head>
    <meta http-equiv="default-style" content="text/html; charset=utf-8"/>
    <title>Contents</title>
    <link rel="stylesheet" href="css/igp-widget-world.css" type="text/css"/>
  </head>
  <body>

    <nav epub:type="toc"><h2>Contents</h2>
      <ol epub:type="list">
        <li><a href="s001-BookTitlePage-01.xhtml">Widget World</a></li>
        <li><a href="s002-Copyright-01.xhtml">Copyright</a></li>
        <li><a href="s003-TableOfContents-01.xhtml">Contents</a></li>
        <li><a href="s004-Introduction-01.xhtml">Introduction</a></li>
        <li><a href="s005-Part-001.xhtml">PART 1: EVENTS AND COMMANDS</a></li>
        <li><a href="s006-Topic-001.xhtml">1: Timeline Events</a></li>
        <li><a href="s007-Topic-002.xhtml">2: Counter Events</a></li>
        <li><a href="s008-Topic-003.xhtml">3: Sprite Events</a></li>
        <li><a href="s009-Part-002.xhtml">PART 2: QUESTIONS AND ANSWERS</a></li>
        <li><a href="s010-Topic-004.xhtml">4: QAA True-False</a></li>
        <li><a href="s011-Topic-005.xhtml">5: QAA True-False Multi</a></li>
        <li><a href="s012-Topic-006.xhtml">6: QAA-Multichoice</a></li>
        <li><a href="s013-Topic-007.xhtml">7: QAA Multi-Response</a></li>
        <li><a href="s014-Topic-008.xhtml">8: QAA Association</a></li>
        <li><a href="s015-Topic-009.xhtml">9: QAA Sequence</a></li>
        <li><a href="s016-Topic-010.xhtml">10: QAA-Textmatch</a></li>
        <li><a href="s017-Topic-011.xhtml">11: QAA-Textmatch Multi</a></li>
        <li><a href="s018-Topic-012.xhtml">12: QAA Sort Word</a></li>
        <li><a href="s019-Topic-013.xhtml">13: QAA Set</a></li>
        <li><a href="s020-Part-003.xhtml">PART 3: INTERACTIVE WIDGETS</a></li>
        <li><a href="s021-Topic-014.xhtml">14: Widgets: Horizontal Sliding Panel</a></li>
        <li><a href="s022-Topic-015.xhtml">15: Widgets: Vertical Sliding Panel</a></li>
        <li><a href="s023-Topic-016.xhtml">16: Widgets: Horizontal Tutorial Panel</a></li>
        <li><a href="s024-Topic-017.xhtml">17: Widgets: Vertical Tutorial Panel</a></li>
        <li><a href="s025-Topic-018.xhtml">18: Widgets: Vertical Scrolling Panel</a></li>
        <li><a href="s026-Topic-019.xhtml">19: Widgets: Horizontal Scrolling Panel</a></li>
        <li><a href="s027-Topic-020.xhtml">20: Widgets: XY Scrolling Panel</a></li>
        <li><a href="s028-Topic-021.xhtml">21: Widgets: Locked Panel</a></li>
        <li><a href="s029-Topic-022.xhtml">22: Widgets: Popups</a></li>
        <li><a href="s030-Topic-023.xhtml">23: Widgets: Reveal</a></li>
        <li><a href="s031-Topic-024.xhtml">24: Widgets: PopUp Panel</a></li>
        <li><a href="s032-Colophon-01.xhtml">Colophon</a></li>
      </ol>
    </nav>
  </body>
</html>

This is my XHTML file. I implemented the lookup as follows:

CXMLDocument* xmlDoc = [[CXMLDocument alloc] initWithContentsOfURL:myURL options:0 error:nil];
NSString* xpath = [NSString stringWithFormat:@"//html:a[contains(@href,'%@')]/../html:a", myValue];
NSArray* navPoints = [ncxToc nodesForXPath:xpath namespaceMappings:[NSDictionary dictionaryWithObject:@"http://www.w3.org/1999/xhtml/" forKey:@"xhtml"] error:nil];

I am trying to find the text of the <a> element whose href attribute matches my value.

I am not sure where I went wrong. The error is:

XPath error : Undefined namespace prefix
XPath error : Invalid expression 

Then I changed the XPath as follows:

NSString* xpath = [NSString stringWithFormat:@"//html/a[contains(@href,'%@')]/../html/a", myValue];

That is, I replaced the colon with a slash. This no longer produces the error above, but it does not return my content either.

I don't know XPath; for XML I have only used NSXMLParser. Please help me identify what is wrong here.

Update

Based on the answers I have received so far, I updated my code as follows:

NSString* xpath = [NSString stringWithFormat:@"//html:a[@href = '%@']", href];
NSArray* navPoints = [ncxToc nodesForXPath:xpath namespaceMappings:[NSDictionary dictionaryWithObject:@"http://www.w3.org/1999/xhtml/" forKey:@"html"] error:nil];

and

NSString* xpath = [NSString stringWithFormat:@"//xhtml:a[@href = '%@']", href];
NSArray* navPoints = [ncxToc nodesForXPath:xpath namespaceMappings:[NSDictionary dictionaryWithObject:@"http://www.w3.org/1999/xhtml/" forKey:@"xhtml"] error:nil];

There is no error but I getting any object in the array. I don't think my XPath is working.

Update 2

The XPath string my code builds is one of the following:

//xhtml:a[@href = 's001-BookTitlePage-01.xhtml'] 
or
//html:a[@href = 's001-BookTitlePage-01.xhtml']

Also, even when I try an XPath as simple as //html:a, I still do not get the anchor tags.

Update 3

I tried checking the XPath online at http://www.xpathtester.com/test

I got the following error for my XPath //html:a:

Exception occurred evaluting XPath: //html:a. Exception: XPath expression uses unbound namespace prefix html

I don't know what it means.

Thanks

Kapil Choubisa
  • Not a duplicate, just realized registering the namespace was right below the XPath expression where I didn't expect it. – Jens Erat Jun 21 '13 at 08:31

2 Answers

//a[@href="s001-Cover-01.xhtml"]/text()
returns 'Cover'. But your document is malformed: <body> is not closed.

Andrei Shender

You registered the namespace as "xhtml", but use the prefix "html" in your query.

Code for registering the namespace:

NSArray* navPoints = [ncxToc nodesForXPath:xpath namespaceMappings:[NSDictionary dictionaryWithObject:@"http://www.w3.org/1999/xhtml/" forKey:@"xhtml"] error:nil];

Your XPath query:

//html:a[contains(@href,'%@')]/../html:a

Change one of the two lines so that the prefixes match, for example:

//xhtml:a[contains(@href,'%@')]/../xhtml:a

Anyway, you should probably strip the last two axis steps, which go back up to the list item and look for anchor elements again. If there were two anchor elements directly inside the list item, you would get both of them even though only one matches the @href attribute. Also, you wrote that you want to check whether the values are equal, so there is no need for contains() if they really are equal:

//xhtml:a[@href = '%@']
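The prefix itself is arbitrary; what the XPath engine compares is the namespace URI bound to it, and that comparison is an exact string match. It may therefore be worth checking that the URI in the mapping is character-for-character the one declared in the document: the XHTML above declares http://www.w3.org/1999/xhtml, while the mapping in the posted code ends with an extra slash. The following is only a rough sketch in Python's standard library, not TouchXML; the trimmed-down document, the prefix p, and the helper titles() are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the table of contents in the question.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <ol>
      <li><a href="s001-BookTitlePage-01.xhtml">Widget World</a></li>
      <li><a href="s002-Copyright-01.xhtml">Copyright</a></li>
    </ol>
  </body>
</html>"""

root = ET.fromstring(XHTML)

# "p" is an arbitrary prefix; it only has to match the key in the mapping.
query = ".//p:a[@href='s001-BookTitlePage-01.xhtml']"


def titles(namespace_uri):
    """Return the link texts found when prefix p is bound to namespace_uri."""
    matches = root.findall(query, namespaces={"p": namespace_uri})
    return [a.text for a in matches]


# URI exactly as declared in the document: the anchor is found.
exact = titles("http://www.w3.org/1999/xhtml")

# Same URI with a trailing slash: a different namespace, so nothing matches.
trailing_slash = titles("http://www.w3.org/1999/xhtml/")

print(exact)
print(trailing_slash)
```

If the same holds in TouchXML, as I would expect from any namespace-aware XPath engine, a mismatched URI would produce no error and simply an empty result array.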
Jens Erat
  • Thanks for your answer. Please check my updates in question I made after your response. – Kapil Choubisa Jun 21 '13 at 10:12
  • "but I getting any object in the array." - did you forget a "not"? Can you print the `xpath` variable so we know it is constructed correctly? What happens if you omit the whole predicate to fetch all anchor tags? – Jens Erat Jun 21 '13 at 11:03
  • `//html::a` uses the unregistered html prefix again. For the other query you posted, I cannot find any matching data in any src tag: there is no "BookTitlePage". – Jens Erat Jun 23 '13 at 13:54
  • It's really strange. I am not getting any content in my array :( I tried `//html:a` and `//html::a` but there is no value in my array. Also, my XHTML is very long, so I only posted a small part of it to show the format. Please check the whole XHTML file, which I have now added to the question. – Kapil Choubisa Jun 24 '13 at 04:49
  • The online evaluator will not work because of non-declared namespaces. I'm using exactly these XPath expressions (with different syntax for namespace declaration of course) within BaseX and it works totally fine with the queries you posted in update 2. – Jens Erat Jun 24 '13 at 10:17