I am using PHP to retrieve content for a given URL and XPath expression. I use DOMDocument / DOMXPath (with query or evaluate).
For a short XPath I get the correct result, but for a longer XPath I get nothing. The expressions themselves seem to be correct: I obtained them with XPather (a Firefox plugin) and re-tested them with YQL.
Do you have any advice on this curious problem?
Example of code:
$doc = new DOMDocument();
$myXMLString = file_get_contents('http://stackoverflow.com/questions/4097230/too-long-xpath-with-domxpath-query-evaluate-return-nothing');
@$doc->loadHTML($myXMLString); // @ suppresses warnings
// (useful for markup with unclosed tags)
$xpath = new DOMXPath($doc);
$fullPath = "/html/body/small/path"; // works
//$fullPath = "/html/body/full/path/with/lot/of/markup"; // does not work
$entries = $xpath->query($fullPath);
// or ->evaluate($fullPath) (same behaviour)
// $entries is a DOMNodeList (empty for a long path query,
// correct for a short path query)
I also tested with attribute predicates, but that does not seem to change anything (a short XPath works, a longer one no longer does).
Example, for this page:
$fullPath = "/html
/body
/div[4]
/div[@id='content']
/div[@id='question-header']
/h1
/a"; // works (retrieves the question title)
$fullPath = "/html
/body
/div[4]
/div[@id='content']
/div[@id='mainbar']
/div[@id='question']
/table
/tbody
/tr[2]
/td[2]
/div[@id='comments-4097230']
/table
/tbody
/tr[@id='comment-4408626']
/td[2]
/div
/a"; // doesn't work
// (should retrieve 'gaby' from the comment)
Edit:
I tested with the SimpleXML library and got exactly the same behavior (correct result for a short query, nothing for a long query).
Edit 2:
I also shortened the longest XPath by deleting some of its leading elements, and then it works. Still, I really do not understand why a complete, correct XPath does not work.
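To narrow down where a long path stops matching, one option is to query each prefix of the path in turn and report the first prefix that returns no node. This is only a minimal sketch: the helper name firstFailingPrefix and the small inline HTML string are hypothetical stand-ins (not part of DOMXPath or of the real page), and the naive split on '/' assumes an absolute path with no '/' inside any predicate.

```php
<?php
// Hypothetical helper (not part of DOMXPath): returns the shortest prefix of
// $path that matches no node, or null if the whole path matches.
// Assumes an absolute path whose predicates contain no '/' character.
function firstFailingPrefix(DOMXPath $xpath, string $path): ?string
{
    // Split into steps, tolerating line breaks and spaces around each step.
    $steps = array_filter(array_map('trim', explode('/', $path)), 'strlen');
    $prefix = '';
    foreach ($steps as $step) {
        $prefix .= '/' . $step;
        $nodes = $xpath->query($prefix);
        if ($nodes === false || $nodes->length === 0) {
            return $prefix; // first step at which nothing matches
        }
    }
    return null;
}

// Small inline document standing in for the fetched page (an assumption).
$html = '<html><body><div id="content"><h1><a>Title</a></h1></div></body></html>';
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

var_dump(firstFailingPrefix($xpath, "/html/body/div[@id='content']/h1/a"));
var_dump(firstFailingPrefix($xpath, "/html/body/div[@id='content']/h2/a"));
```

Running the same helper against the document loaded from the real URL, with the long path above, should point at the exact element where the parsed tree and the path from the browser diverge.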