I have a website status checker that writes the latest URLs checked to a log file (URL, status e.g. up or down, and date checked). The trouble I'm now finding is that it also records spider/Googlebot visits, so the latest site checks are being written multiple times per second...
Here is my log writing function:
public function log($url, $status) {
    // If we were given a full URL, reduce it to just the host name
    if (strpos($url, "/") !== false):
        if (strpos($url, "http://") === false):
            $url = "http://" . $url;
        endif;
        $parse = parse_url($url);
        $url = $parse['host'];
    endif;

    if (!empty($url)):
        // The new entry goes at the top of the log
        $arrayToWrite = array(
            array(
                "url"    => $url,
                "status" => $status,
                "date"   => date("m/d/Y h:i")
            )
        );

        if (file_exists($this->logfile)):
            $fileContents  = file_get_contents($this->logfile);
            $arrayFromFile = unserialize($fileContents);

            // Only merge in the old entries if the file held a valid array
            if (is_array($arrayFromFile)):
                // Remove any previous entry for this host so it isn't listed twice
                foreach ($arrayFromFile as $k => $tmpArray):
                    if ($tmpArray['url'] == $url):
                        unset($arrayFromFile[$k]);
                    endif;
                endforeach;

                // Keep only the 9 most recent old entries (10 total with the new one)
                array_splice($arrayFromFile, 9);
                $arrayToWrite = array_merge($arrayToWrite, $arrayFromFile);
            endif;
        endif;

        file_put_contents($this->logfile, serialize($arrayToWrite));
    endif;
}
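For reference, the log file just holds a serialized array of the last ten checks, newest first. After unserialize() it looks something like this (made-up data):

array(
    array("url" => "example.com", "status" => "up",   "date" => "06/01/2014 09:30"),
    array("url" => "example.org", "status" => "down", "date" => "06/01/2014 09:28"),
    // ... up to 10 entries in total
)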
What kind of amendments could I make so it ignores bot/spider visits and only tracks/writes real visitors, please?
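For example, would a user-agent check along these lines be the right direction? (Just a rough sketch: isBot() is a helper I'd add myself, and the substrings are only examples, not a complete bot list.)

private function isBot() {
    // Treat the visitor as a bot if the user agent contains a known crawler keyword
    $agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    $botSignatures = array('bot', 'spider', 'crawl', 'slurp');
    foreach ($botSignatures as $signature):
        if (stripos($agent, $signature) !== false):
            return true;
        endif;
    endforeach;
    return false;
}

I could then bail out at the top of log() with if ($this->isBot()) return; before anything is written, but I'm not sure whether relying on the user agent alone is reliable enough.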