I am making a PHP scraper and have the following piece of code, which grabs the title from the page by looking inside the span with the class uiButtonText. However, I now want to scan for a hyperlink and have preg_match match <a href="*" class="thelink" onclick="*">(.*)</a>.
I want the stars to act as wildcards, so that I can get the hyperlink from the page even if the href and onclick values change for each one.
if (preg_match("/<span class=\"uiButtonText\">(.*)<\/span>/i", $cache, $matches)){print($matches[1] . "\n");}else {}
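To illustrate what I am after, here is a rough sketch of the kind of pattern I have in mind. It assumes that `[^"]*` (any run of characters other than a double quote) can stand in for each star, and that a non-greedy `(.*?)` is enough for the link text. The markup in `$sample` is made up for this example, and the sketch also assumes the attributes always appear in this order with double quotes, which may not hold on the real page.

```php
<?php
// Hypothetical sample markup; the real page's attribute values will differ.
$sample = '<a href="/some/page" class="thelink" onclick="return go(1);">Example Title</a>';

// [^"]* plays the role of each star: any characters except a double quote.
// The first group captures the href value; (.*?) captures the link text
// non-greedily, so it stops at the first </a> instead of the last one.
$pattern = '/<a href="([^"]*)" class="thelink" onclick="[^"]*">(.*?)<\/a>/i';

if (preg_match($pattern, $sample, $matches)) {
    $href = $matches[1]; // value of the href attribute
    $text = $matches[2]; // text between <a ...> and </a>
    print($href . "\n");
    print($text . "\n");
}
```

Capturing the href in its own group is my own addition here; if only the link text is needed, the parentheses around the first `[^"]*` could be dropped.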
My Full Code:
<?php
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';
$url = "http://www.facebook.com/MauiNuiBotanicalGardens/info";
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
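$html = curl_exec($ch);
if ($html === false) {
    // curl_exec() returns false on failure; with CURLOPT_FAILONERROR that includes HTTP status codes >= 400
    die('cURL error: ' . curl_error($ch) . "\n");
}
$cache = $html;
// Print the text captured inside <span class="uiButtonText">...</span>
if (preg_match("/<span class=\"uiButtonText\">(.*)<\/span>/i", $cache, $matches)) {
    print($matches[1] . "\n");
}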
?>