Possible Duplicate:
Robust, Mature HTML Parser for PHP
How to use wikipedia api if it exists?
I'm using YQL to get information from Wikipedia and store it in my private database. For example, I'm scraping this page, and I need all the film names from it. I'm using this code:
JavaScript:
    // Fetch the filmography table through YQL, then post its rows to db.php
    $.YQL("select * from html where url='http://en.wikipedia.org/wiki/Rajinikanth_filmography' and xpath='/html/body/div[3]/div[3]/div[4]/table'", function (data) {
        // Array of table rows (tr) as returned by YQL
        var str = data.query.results.table.tr;
        console.log(str);
        $.ajax({
            type: "POST",
            url: "db.php",
            data: {
                sendingStr: str
            },
            success: function (data) {
                console.log(data);
            }
        });
    });
PHP:
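    // Rows posted from the JavaScript above
    $recv = $_POST['sendingStr'];
    $arraySize = count($recv);
    // Start at 1 to skip the header row
    for ($i = 1; $i < $arraySize; $i++) {
        // Read the second cell (td[1]) of each row
        foreach ($recv[$i]["td"][1] as $value) {
            foreach ($value as $val) {
                if (strlen($val["content"]) >= 3) {
                    echo $val["content"] . "\n";
                }
            }
        }
    }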
Here is my problem: if you look at the page, each row in the table has several cells with rowspan. But when I scrape it, I'm getting only the first value from each row. What should I change in my code so that I get all the values?
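To illustrate what the rowspans do to the cell positions, here is a minimal sketch. It assumes a simplified, hypothetical row shape (arrays of `{ content, rowspan }` objects with placeholder values), which is not the exact structure YQL returns, and `expandRowspans` is a name made up for this example. The idea is that a cell with rowspan is missing from the rows below it, so a fixed index such as `td[1]` can point at a different column from row to row. Copying each spanning cell down into the rows it covers makes the column indexes line up again.

```javascript
// Hypothetical helper: expands rowspans so every row has one value per column.
// Input rows are simplified arrays of { content, rowspan } cells (an assumption,
// not the exact YQL output).
function expandRowspans(rows) {
  var grid = [];
  var pending = []; // pending[col] = { content, remaining } for cells spanning down

  rows.forEach(function (cells) {
    var out = [];
    var next = 0; // index of the next unread cell in this row

    for (var col = 0; next < cells.length || col < pending.length; col++) {
      var carried = pending[col];
      if (carried && carried.remaining > 0) {
        // A cell from an earlier row still covers this column
        out.push(carried.content);
        carried.remaining -= 1;
      } else if (next < cells.length) {
        var cell = cells[next++];
        var span = parseInt(cell.rowspan, 10) || 1; // no rowspan means 1
        out.push(cell.content);
        pending[col] = { content: cell.content, remaining: span - 1 };
      } else {
        out.push(null); // no cell for this column in this row
      }
    }
    grid.push(out);
  });

  return grid;
}

// Placeholder data: the year cell spans two rows, so the second row
// only contains the film cell.
var rows = [
  [{ content: "1975", rowspan: "2" }, { content: "Film A" }],
  [{ content: "Film B" }],
  [{ content: "1976" }, { content: "Film C" }]
];

var grid = expandRowspans(rows);
console.log(grid);
```

In the raw placeholder rows, `rows[1][1]` is undefined because the year cell is absent from the second row, while in the expanded grid the film name sits at index 1 in every row.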