I am using PHPCrawl to crawl a website that I would like to extract data from, but I do not know where to start with retrieving data from, for example, a span with a specific class.
For example, I would like to retrieve the name "Jan" from this span:
<span class="firstname">Jan</span>
I have tried using DOMDocument() and DOMXPath(), but I get errors when loading the HTML string.
Here is what I have so far:
$doc = new DOMDocument();
$doc->loadHTML($PageInfo->content);
$xpath = new DOMXPath($doc);
foreach ($xpath->query("//span[@class='family-name']") as $node) {
    // A DOMElement cannot be converted to a string directly; read its text via nodeValue
    echo "Family name: " . $node->nodeValue . "\n";
}
However, this produces errors like these:
PHP Notice: DOMDocument::loadHTML(): Namespace prefix g is not defined in Entity, line: 294 in /var/www/crawl/www/crawl.php on line 30
PHP Warning: DOMDocument::loadHTML(): Tag g:plusone invalid in Entity, line: 294 in /var/www/crawl/www/crawl.php on line 30
Since I cannot change the HTML (it is fetched by PHPCrawl), I need to do something else, but I don't know what. Does PHPCrawl itself have any tools for this?
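One direction that might work, sketched here under assumptions: the notices above come from libxml's HTML parser complaining about non-standard tags such as g:plusone, and PHP's libxml_use_internal_errors() can keep those messages out of the output while DOMDocument still builds a tree. In the sketch below, the $html string is a made-up stand-in for $PageInfo->content, and extractByClass() is a hypothetical helper name, not part of PHPCrawl.

```php
<?php
// Hypothetical stand-in for $PageInfo->content, including the kind of
// non-standard tag that triggers the notices.
$html = '<html><body><g:plusone></g:plusone>'
      . '<span class="firstname">Jan</span></body></html>';

/**
 * Hypothetical helper: returns the text of every <span> whose class
 * attribute is exactly $class. Assumes $class contains no quote characters.
 */
function extractByClass(string $html, string $class): array
{
    // Collect libxml parse errors internally instead of emitting PHP notices.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $values = [];
    foreach ($xpath->query("//span[@class='" . $class . "']") as $node) {
        // nodeValue holds the text content of the element.
        $values[] = trim($node->nodeValue);
    }
    return $values;
}

$names = extractByClass($html, 'firstname');
foreach ($names as $name) {
    echo "First name: " . $name . "\n";
}
```

The XPath test matches the class attribute exactly, so a span with several classes (for example class="firstname bold") would need a contains() expression instead.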