
Is there a way to detect if the page is visited by a bot?

I tried checking whether $_SERVER['HTTP_USER_AGENT'] is in an array of bot names. It works fine.

$bot = array("Slurp", "Scooter", "URL_Spider_SQL", "Googlebot", "Firefly", "WebBug", "WebFindBot", "crawler",  "appie", "msnbot", "InfoSeek", "FAST", "Spade", "NationalDirectory",);

if (in_array($_SERVER['HTTP_USER_AGENT'], $bot)) {
    return true;
} else {
    return false;
}

Is there a better and more secure way to do this (other than having to type in all the bot names)? What's the difference between my method and this?

Sid
  • There isn't much difference between the solutions, except that the other one 1) isn't an array and 2) makes sure that the value from HTTP_USER_AGENT is lowercase and compares it to a lowercase value. Personally I would go with yours, but add the strtolower function to it and list all the spiders / bots in lowercase. – Dale Nov 30 '12 at 10:31
  • Just one little notice at this point: Relying on HTTP_USER_AGENT is not secure as this string can be set to anything you want. There are also lists of many bot useragents available. Maybe use one of these instead of building your own list. – Alex2php Nov 30 '12 at 10:52
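
To illustrate Dale's suggestion, here is a minimal sketch that keeps the array but lowercases both sides and does a substring match. The function name is_known_bot and the shortened list are made up for this example, the scalar type hints assume PHP 7 or later, and the user-agent string is only a sample value. It also shows why a plain in_array() check does not match a full user-agent string: in_array() compares the whole string against each entry.

```php
<?php
// Hypothetical helper (not from the thread): case-insensitive substring
// match of the user agent against a list of lowercase bot names.
function is_known_bot(string $userAgent, array $lowercaseBots): bool
{
    $agent = strtolower($userAgent);
    foreach ($lowercaseBots as $bot) {
        // strpos() returns 0 for a match at position 0, so compare with !== false.
        if (strpos($agent, $bot) !== false) {
            return true;
        }
    }
    return false;
}

// Shortened, lowercased version of the list from the question.
$bots = array("slurp", "googlebot", "msnbot", "crawler");

// Sample user-agent string, used here only for illustration.
$ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";

var_dump(in_array($ua, $bots));     // bool(false): no entry equals the whole string
var_dump(is_known_bot($ua, $bots)); // bool(true): "googlebot" occurs inside it
```

As Alex2php points out, the header can be set to anything by the client, so a match here is only a hint, not proof.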

2 Answers

Well, after some digging on Google I found this.

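// $bots is the array of bot names (named $bot in the question).
$agent = strtolower($_SERVER['HTTP_USER_AGENT']);
foreach($bots as $bot)
{
    // stripos() is case-insensitive, so mixed-case names in $bots still match.
    if(stripos($agent,$bot)!==false)
    {
        return true;
    }
}
// Only report "not a bot" after every name has been checked.
return false;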

Thanks for the support Dale!!

Sid

Looking at Sid's answer and googling, I found another way to detect bots on this site. Look:

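function detect_is_bot () {
    $bots = array("Slurp", "Scooter", "URL_Spider_SQL", "Googlebot", "Firefly", "WebBug", "WebFindBot", "crawler",  "appie", "msnbot", "InfoSeek", "FAST", "Spade", "NationalDirectory",);
    // Some clients send no User-Agent header at all; treat that as "no match".
    if (!isset($_SERVER['HTTP_USER_AGENT'])) {return false;}
    $agent = strtolower($_SERVER['HTTP_USER_AGENT']);
    foreach($bots as $bot) {
        // stripos() ignores case, so the mixed-case names above still match.
        if(stripos($agent,$bot)!==false) {return true;}
    }
    // No bot name was found anywhere in the user agent.
    return false;
}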
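
An alternative to looping over the list is to build one case-insensitive regular expression from it. This is only a sketch under some assumptions: the function name matches_bot_pattern and the shortened list are invented for the example, the nullable type hint needs PHP 7.1 or later, and the user agent is passed in as a parameter so the function can be tried outside a web request.

```php
<?php
// Hypothetical helper (not from the thread): one regex built from the bot list.
function matches_bot_pattern(?string $userAgent, array $bots): bool
{
    // A missing or empty header, or an empty list, cannot match anything.
    if ($userAgent === null || $userAgent === '' || count($bots) === 0) {
        return false;
    }
    // preg_quote() escapes regex metacharacters; '/' is the delimiter used below.
    $quoted = array_map(function ($bot) {
        return preg_quote($bot, '/');
    }, $bots);
    // The "i" modifier makes the match case-insensitive.
    $pattern = '/' . implode('|', $quoted) . '/i';
    return preg_match($pattern, $userAgent) === 1;
}

// Shortened version of the list from the answers above.
$bots = array("Slurp", "Googlebot", "msnbot", "crawler");

// In a web request the value would come from $_SERVER; it may be missing.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : null;
var_dump(matches_bot_pattern($ua, $bots));
```

Either way the check only looks at a header the client controls, so it cannot tell a real crawler from a client that copies its user-agent string.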