I am trying to learn how to use XPath in web scraping. One of the things I'm trying to do is get all the data from a table element and echo it to the screen. I created a test HTML document:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Insert title here</title>
</head>
<body>
<table>
<tbody>
<tr>
<td>
This is table Data 1
<a href="this/is/href1">
<img src="/this/is/src1_.jpg">
</a>
</td>
<tr>
<td>
This is table Data 2
<a href="this/is/href2">
<img src="/this/is/src2_.jpg">
</a>
</td>
<tr>
<td>
This is table Data 3
<a href="this/is/href3">
<img src="/this/is/src3_.jpg">
</a>
</td>
</tr>
</tbody>
</table>
</body>
</html>
I am having problems with my XPath query and with iterating through the returned data. I want to display the elements and their attributes as if they were HTML. The XPath expressions I have tried for getting the table data are:
$node = $xpath->query("/html/body/table");
$node = $xpath->query("/html/body/table/child::node()");
To iterate through the DOMNodeList, I'm using a for loop as suggested on http://php.net/manual/en/domxpath.query.php:
for ($i = 0; $i < $node->length; $i++) {
    echo "Node Item: " . $node->item($i)->nodeValue . "<br>";
}
The output:
Node Item: This is table Data 1 This is table Data 2 This is table Data 3
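The snippets above do not show how `$xpath` is created, so here is a minimal self-contained sketch that reproduces the behaviour. It assumes the HTML is loaded from a string with `DOMDocument::loadHTML`; the names `$html`, `$doc`, `$nodes`, and `$lines` are hypothetical, and the markup is a shortened copy of the test document.

```php
<?php
// Assumption: a shortened copy of the test document, held in a string.
$html = '<html><body><table><tbody>'
      . '<tr><td>This is table Data 1 <a href="this/is/href1"><img src="/this/is/src1_.jpg"></a></td></tr>'
      . '<tr><td>This is table Data 2 <a href="this/is/href2"><img src="/this/is/src2_.jpg"></a></td></tr>'
      . '</tbody></table></body></html>';

// Parse the markup and build an XPath evaluator over it.
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Same query as in the question: selects the single <table> element.
$nodes = $xpath->query("/html/body/table");

// Collect one line per matched node, using nodeValue as in the question.
$lines = [];
foreach ($nodes as $node) {
    $lines[] = "Node Item: " . $node->nodeValue;
}

echo implode("<br>", $lines), "\n";
```

As far as I can tell, `nodeValue` on an element returns the concatenated text of its descendants, which would explain why the anchor and image tags and their attributes never show up in the output.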
How do I go about getting the anchor and image tags along with their href and src attributes?