I tried to scrape data from a web page, but my DOM-based code gives DOM warnings. So I want to know: is it possible to use regex to scrape the date, review, and rating value from this page?
Here is the DOM version, which gives an error: https://eval.in/143074
A smaller version of the code works: https://eval.in/143036
Is this possible using regex? My current code:
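<?php
$html = file_get_contents('http://www.yelp.com/biz/franchino-san-francisco?start=80');

// Pass the raw HTML to the parser; escapeshellarg() and nl2br() would alter the markup.
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

// Parse once and reuse the same XPath object for every query.
$xpath = new DOMXPath($dom);

// First element whose class attribute is exactly 'rating-qualifier'.
$classname = 'rating-qualifier';
$results = $xpath->query("//*[@class='" . $classname . "']");
if ($results->length > 0) {
    echo trim($results->item(0)->nodeValue), "\n";
}

// First element whose class attribute is exactly 'review_comment ieSucks'.
$classname = 'review_comment ieSucks';
$results = $xpath->query("//*[@class='" . $classname . "']");
if ($results->length > 0) {
    echo trim($results->item(0)->nodeValue), "\n";
}

// Content attribute of the first <meta> element in the document.
$meta = $dom->documentElement->getElementsByTagName("meta");
if ($meta->length > 0) {
    echo $meta->item(0)->getAttribute('content'), "\n";
}
?>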
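For comparison, here is a minimal sketch of the regex approach I have in mind. It runs on a small made-up snippet, not the real page: the `$sample` markup, the `itemprop` attribute names, and the helper `extract_review()` are all assumptions for illustration, and patterns like these would likely break as soon as the attribute order or markup changes.

```php
<?php
// Hypothetical markup standing in for one review; the real page may be structured differently.
$sample = <<<HTML
<div class="review">
  <meta itemprop="ratingValue" content="4.0">
  <span class="rating-qualifier">
    <meta itemprop="datePublished" content="2014-03-12">3/12/2014
  </span>
  <p class="review_comment ieSucks">Great pasta, friendly staff.</p>
</div>
HTML;

// Hypothetical helper: returns the first rating, date, and review text found, or null per field.
function extract_review(string $html): array
{
    $patterns = [
        // Assumes itemprop comes before content inside the <meta> tag.
        'rating' => '/<meta\s+itemprop="ratingValue"\s+content="([^"]+)"/i',
        'date'   => '/<meta\s+itemprop="datePublished"\s+content="([^"]+)"/i',
        // Non-greedy match up to the first closing </p>; nested <p> tags are not handled.
        'review' => '/<p\s+class="review_comment[^"]*"[^>]*>(.*?)<\/p>/is',
    ];

    $out = [];
    foreach ($patterns as $key => $pattern) {
        $out[$key] = preg_match($pattern, $html, $m)
            ? trim(html_entity_decode(strip_tags($m[1])))
            : null;
    }
    return $out;
}

print_r(extract_review($sample));
```

To get every review on a page instead of only the first, `preg_match_all()` could replace `preg_match()` here, though keeping each date, rating, and review paired correctly then depends on all three patterns matching in the same order.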