3

I'm an XPath newbie. I want to loop through the result of a cURL request and print each element of the only table on the page.

I've used the XPath plugin for Firefox to obtain my expression. My table is structured as follows:

<table>
<tr class="listItemOneBg">
    <td valign="top">
        SMITH
    </td>
    <td valign="top">
        WILLIAM C C
    </td>
    <td valign="top">
        Male
    </td>
    <td valign="top">

    </td>
    <td valign="top">

    </td>
    <td valign="top">

    </td>
    <td valign="top">

    </td>
    <td valign="top">
        BLACKWOOD
    </td>
    <td valign="top">
        61
    </td>
    <td valign="top">
        1924
    </td>
    <td valign="top">
        <a target="_blank" href='XXX'>
            order</a>
    </td>
</tr>

<tr class="listItemTwoBg">
    <td valign="top">
        SMITH
    </td>
    <td valign="top">
        WILLIAM C PAGE-
    </td>
    <td valign="top">
        Male
    </td>
    <td valign="top">

    </td>
    <td valign="top">

    </td>
    <td valign="top">

    </td>
    <td valign="top">

    </td>
    <td valign="top">
        SWAN
    </td>
    <td valign="top">
        9
    </td>
    <td valign="top">
        1914
    </td>
    <td valign="top">
        <a target="_blank" href='XXY'>
            order</a>
    </td>
</tr>       

Here's the code I've tried so far. I get the message "Warning: Invalid argument supplied for foreach()". What am I doing wrong?

$page = curl_exec($ch);
curl_close($ch);

// Create new PHP DOM document
$dom = new DOMDocument;
// Load html from curl request into document model
@$dom->loadHTML($page);
$xpath = new DOMXPath($dom);

$tableRows = $xpath->query("id('divResults')/table/tbody/tr");
foreach ($tableRows as $row) {
     // fetch all 'tds' inside this 'tr'
    $td = $xpath->query('td', $row);
    echo $td->item(1)->textContent;
}
user1801060
  • Why have you put extra single quotes around the XPath? What's the deal with the `x:` prefixes? – Jon Oct 06 '13 at 11:24
  • I've updated my code. As for the prefixes, I've used the output of the Xpath plugin to give me my expressions. – user1801060 Oct 06 '13 at 11:28
  • My expression now works, although I can't loop yet – user1801060 Oct 06 '13 at 11:51
  • @user1801060 Try fetching all the way to `td` in one query. And see if it works. Also try `'//td'` in the second. – CodeAngry Oct 06 '13 at 12:20
  • possible duplicate of [Why does my XPath query (scraping HTML tables) only work in Firebug, but not the application I'm developing?](http://stackoverflow.com/questions/18241029/why-does-my-xpath-query-scraping-html-tables-only-work-in-firebug-but-not-the) – Jens Erat Oct 06 '13 at 13:42
  • Additionally: the HTML snippet should contain the element (a div?) that has that id. – Jens Erat Oct 06 '13 at 13:46
  • If you're editing your question heavily and it already has answers, you should post a comment saying so, to prevent confusion like in @CodeAngry's answer -- or maybe even ask another question (the "one big" problem of your first question actually was solved). – Jens Erat Oct 07 '13 at 08:36

2 Answers

4

Assuming the table you're after is actually in a <div id="divResults">...

$tableRows = $xpath->query('//div[@id="divResults"]/table/tbody/tr');
foreach ($tableRows as $row) {
    $cells = $row->getElementsByTagName('td');
}
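
To make the loop above concrete, here is a minimal, self-contained sketch. The inline `$page` string is a made-up stand-in for the cURL response, and it assumes the real source has both a `<div id="divResults">` wrapper and a literal `<tbody>`; adjust the expression if yours differs. It also checks the return value of `query()`, since a `false` result (for example from a malformed expression) is what typically triggers the "Invalid argument supplied for foreach()" warning.

```php
<?php
// Hypothetical stand-in for the HTML returned by curl_exec(); the real page may differ.
$page = <<<HTML
<html><body>
<div id="divResults">
  <table>
    <tbody>
      <tr class="listItemOneBg"><td>SMITH</td><td>WILLIAM C C</td><td>Male</td></tr>
      <tr class="listItemTwoBg"><td>SMITH</td><td>WILLIAM C PAGE-</td><td>Male</td></tr>
    </tbody>
  </table>
</div>
</body></html>
HTML;

$dom = new DOMDocument;
// Collect parser warnings about imperfect HTML internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$tableRows = $xpath->query('//div[@id="divResults"]/table/tbody/tr');

$rows = [];
// query() returns false when the expression is malformed, so guard before looping.
if ($tableRows !== false) {
    foreach ($tableRows as $row) {
        $cells = [];
        // Collect the trimmed text of every <td> in this row.
        foreach ($row->getElementsByTagName('td') as $cell) {
            $cells[] = trim($cell->textContent);
        }
        $rows[] = $cells;
    }
}

// Print one line per table row.
foreach ($rows as $cells) {
    echo implode(' | ', $cells), PHP_EOL;
}
```

If the node list comes back empty rather than `false`, the expression is valid but does not match the document, so compare it against the raw source rather than the browser's inspector.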
Phil
  • Bless you! You've saved me several hours of grief – user1801060 Oct 07 '13 at 08:16
  • @user1801060 +1. And this is a proper XPath expression, not the peculiar `ID('...')` thing you used, which I had never seen before. – CodeAngry Oct 07 '13 at 15:25
  • what is the expression for `
    `? this expression is returning false `$time = $path->query('//div[@class="scheduleWrap"]/table/tbody/tr');`
    – Neocortex Dec 04 '14 at 06:13
  • @BannedfromSO Make sure the actual HTML source has a `<tbody>` tag. Your browser's inspector tab is not an accurate representation of the source document – Phil Dec 04 '14 at 21:53
  • Whatever the case, the PHP code reads the same HTML document the browser does, so it should have worked. I worked out what was probably happening, and this is how I fixed it: query the HTML tag that encloses the targeted content directly, as in **`$your_dom_obj_var->query('//HTML_ELEMENT[@ATTRIBUTE="VALUE"]')`**. That's it, nothing more after it. – Neocortex Dec 05 '14 at 04:23
1

That's a non-standard XPath expression. It cannot work in DOMXPath.
(Downvoters, the expression has been edited since the question was posted. Cheers!)

This is where you learn XPath:

PS: It's where I learnt it.

CodeAngry
  • This is a standard XPath 1.0 (and upwards) expression which is totally fine in DOMXPath, apart from not matching the input document. – Jens Erat Oct 06 '13 at 13:46
  • No, it's not non-standard. Just because it's not in the examples you used to learn XPath doesn't make it non-standard. Read the spec. – Michael Kay Oct 06 '13 at 13:47
  • @MichaelKay Homie, the question has been thoroughly edited since I posted this. I left the answer here for others who need sources to learn XPath properly. Don't -1 when you're late to the party. – CodeAngry Oct 07 '13 at 01:50
  • @JensErat My comment to MK goes to you too. Regards! – CodeAngry Oct 07 '13 at 01:53
  • Sorry. This is a real weakness of StackOverflow, that people are allowed to change the question and thus make nonsense of existing answers. Take the downvote as a signal to future readers to mean "don't make use of this answer", not as a criticism of the person who contributed the answer. – Michael Kay Oct 07 '13 at 07:53
  • I'm sorry, too. In future you might want to briefly explain your reasoning and reference the broken query (by fixing it); then the edit would have been noticed. As it stands, your answer wouldn't pass for a great answer anyway (though that's no reason for a downvote yet). It _could_ even be flagged as "not an answer", as it's mainly a reference and does not attempt to solve the question. – Jens Erat Oct 07 '13 at 08:33