I have the following function to get the date of Googlebot's last access:

// Get Googlebot's last access date
function googlebot_lastaccess($domain_name)
{
    $request = 'http://webcache.googleusercontent.com/search?hl=en&q=cache:'.$domain_name.'&btnG=Google+Search&meta=';
    $data = getPageData($request);
    $spl = explode("as it appeared on", $data);
    //echo "<pre>".$spl[0]."</pre>";
    $spl2 = explode(".<br>", $spl[1]);
    $value = trim($spl2[0]);
    //echo "<pre>".$spl2[0]."</pre>";
    if (strlen($value) == 0)
    {
        return 0;
    }
    else
    {
        return $value;
    }
}

echo "Googlebot last access = ".googlebot_lastaccess($domain_name)."<br />"; 

function getPageData($url) {
    if (function_exists('curl_init')) {
        $ch = curl_init($url); // initialize curl with the given URL
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // forward the visitor's user agent
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
        // ini_get() usually reports a disabled boolean setting as '', '0' or false rather than 'Off'
        if ((ini_get('open_basedir') == '') && !filter_var(ini_get('safe_mode'), FILTER_VALIDATE_BOOLEAN)) {
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any
        }
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // max. seconds to wait while connecting
        curl_setopt($ch, CURLOPT_FAILONERROR, 1); // fail on HTTP status codes >= 400
        return @curl_exec($ch);
    }
    else {
        return @file_get_contents($url);
    }
}

But this script prints a snapshot of the whole page on screen, i.e. the entire page as cached by Google. I want to capture only the date and time that follow the words "as it appeared on" and print them, e.g. 8 Oct 2011 14:03:12 GMT.

How can I do this?

Kara
grigione

2 Answers

Replace this line:

echo "Googlebot last access = ".googlebot_lastaccess($domain_name)."<br />";

with this:

$content = googlebot_lastaccess($domain_name);
$pos = strpos($content, 'GMT'); // false when no GMT timestamp is present
$date = ($pos === false) ? 'unknown' : trim(substr($content, 0, $pos + strlen('GMT')));
echo "Googlebot last access = ".$date."<br />";
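
As an alternative sketch (not part of the original answer), a regular expression can pull the timestamp straight out of the fetched HTML. The function name extract_cache_date is hypothetical, and the pattern assumes the cached page still contains the phrase "as it appeared on" followed by a date in the form 8 Oct 2011 14:03:12 GMT. Google may change that wording at any time.

```php
<?php
// Hypothetical helper: returns the timestamp that follows "as it appeared on",
// or null when the phrase or a GMT timestamp in the assumed format is missing.
function extract_cache_date($html)
{
    $pattern = '/as it appeared on\s+(\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2} GMT)/';
    if (preg_match($pattern, (string) $html, $matches)) {
        return $matches[1];
    }
    return null;
}

// Sample input made up for illustration
$sample = 'It is a snapshot of the page as it appeared on 8 Oct 2011 14:03:12 GMT. The current page could have changed.';
echo extract_cache_date($sample), "\n"; // prints: 8 Oct 2011 14:03:12 GMT
```

Returning null instead of 0 for the "not found" case lets the caller tell a missing date apart from a real value with a strict comparison.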
Aurelio De Rosa
  • Hmm, a problem: Google banned me with "Our systems have detected unusual traffic from your computer network." Is there any soft solution? – grigione Oct 14 '11 at 10:50
  • @grigione Usually they ban you only for a while. Don't abuse Google by sending too many requests. It is prohibited by their terms of use. – Aurelio De Rosa Oct 14 '11 at 10:55

Why query Google about when it was last at your site when you can detect Googlebot on your site yourself and see which pages it is on? This also lets you track where Googlebot went with a simple write-to-database function.

See the Stack Overflow question "how to detect search engine bots with php?".
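
A minimal sketch of that idea, under stated assumptions: the function names is_googlebot and log_googlebot_visit are hypothetical, the check is a plain substring match on the User-Agent header (which can be spoofed), and a log file in the system temp directory stands in for the database write mentioned above.

```php
<?php
// Hypothetical helper: true when the User-Agent string mentions Googlebot.
// The header can be spoofed, so treat the result as a hint, not proof.
function is_googlebot($userAgent)
{
    return stripos((string) $userAgent, 'Googlebot') !== false;
}

// Hypothetical helper: appends one line (GMT time, tab, requested URI)
// to $logFile for each Googlebot request. Returns true only when a line was written.
function log_googlebot_visit($userAgent, $uri, $logFile)
{
    if (!is_googlebot($userAgent)) {
        return false;
    }
    $line = gmdate('j M Y H:i:s') . " GMT\t" . $uri . "\n";
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}

// Usage: run on every page request (the log path is an assumption)
$ua  = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$uri = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';
log_googlebot_visit($ua, $uri, sys_get_temp_dir() . '/googlebot_visits.log');
```

The last line of the log file then gives the most recent Googlebot visit without sending any request to Google.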

Community