
I have a multilingual website.

I split the languages across subdomains:

fr-fr.mywebsite.com
es-es.mywebsite.com
www.mywebsite.com // root domain => neutral language for bots

On the subdomains, if no language cookie is set, I use the subdomain as the language code.

On the primary domain (www), if no language cookie is set:

  • if the visitor is a bot, I use the neutral language
  • if the visitor is not a bot, I detect the user's language from the "accept-language" header.

How can I reliably detect whether a visitor is a robot? I read older topics on the matter, but people simply checked for the "accept-language" header because bots didn't send it. Today, however, Google does send this header.

Is it safer to detect bots or, inversely, to detect web browsers? If a bot goes undetected, the website will be indexed in the wrong language.

Any ideas?

Ndrou
  • Why not use language annotations? That way the bot will find the alternate-language pages. – Cesar Sep 22 '16 at 16:55
  • I use them too. But the primary domain has to auto-detect the user's language :) – Ndrou Sep 23 '16 at 07:07
  • Hi @Ndrou, I still don't understand why you need to find out whether the user is a bot. If the request has a valid "accept-language" header, you can send it to the matching language site, and if not, to your main or default language site. A bot will still be able to find all the alternate languages through the language annotations and index them too. – Cesar Sep 23 '16 at 16:10
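
To make the flow described in the comment above concrete, here is a minimal PHP sketch of choosing a language from the "accept-language" header and falling back to a default when the header is missing or matches nothing. The function name `pickLanguage`, the list of supported codes and the `'neutral'` fallback are all hypothetical, chosen only to mirror the subdomains in the question; this is one possible approach, not a definitive implementation.

```php
<?php
// Hypothetical helper: choose a site language from an Accept-Language header.
// $supported holds lowercase codes such as 'fr-fr'; $fallback is returned
// when the header is absent or none of its languages is supported.
function pickLanguage(?string $header, array $supported, string $fallback): string
{
    if ($header === null || trim($header) === '') {
        return $fallback; // no header at all: use the default site
    }

    $candidates = [];
    foreach (explode(',', $header) as $index => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue;
        }
        $q = 1.0; // a tag without an explicit q-value has weight 1
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $candidates[] = [$tag, $q, $index];
    }

    // Highest q-value first; ties keep the order sent by the client.
    usort($candidates, fn($a, $b) => [$b[1], $a[2]] <=> [$a[1], $b[2]]);

    foreach ($candidates as [$tag, $q]) {
        if ($q <= 0) {
            continue; // q=0 means "not acceptable"
        }
        if (in_array($tag, $supported, true)) {
            return $tag;
        }
        // Otherwise match on the primary subtag, e.g. "es" -> "es-es".
        $primary = explode('-', $tag)[0];
        foreach ($supported as $lang) {
            if (explode('-', $lang)[0] === $primary) {
                return $lang;
            }
        }
    }

    return $fallback;
}
```

On the www domain this could be called as `pickLanguage($_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? null, ['fr-fr', 'es-es'], 'neutral')`, so a client that sends no usable header simply gets the neutral site.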

1 Answer


Assuming you're using PHP, you can read HTTP_USER_AGENT and check whether the user agent contains 'googlebot'.

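// The header may be absent, so default to an empty string
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
// Case-insensitive search for "googlebot" anywhere in the user agent
if (stripos($userAgent, 'googlebot') !== false)
{
    // what to do
}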

Here's the link to a question (and the example which I pulled from it).

how to detect search engine bots with php?

Matthew
  • Yes, but Googlebot is not the only one; there are many bots, such as Yahoo, Bing, Yandex, etc. How can I be sure not to forget one? – Ndrou Sep 23 '16 at 07:06
  • You can add all those bots' names; just search for each bot's name. You can also log the `HTTP_USER_AGENT` value and sort through the list to see whether a bot-like name has popped up. All the well-known, legitimate search engines name their bots. – Matthew Sep 26 '16 at 01:44
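
As a rough sketch of the "add all those bot names" idea, the check can loop over a list of substrings instead of testing for 'googlebot' alone. The function name `isKnownBot` and the constant `BOT_TOKENS` are hypothetical, and the list is only a starting point that would need extending from your own logs. Treating a missing user agent as a bot is also an assumption made here, on the grounds that serving the neutral language is the safer mistake. Keep in mind that any client can send any user agent string, so this only identifies bots that announce themselves.

```php
<?php
// Illustrative, incomplete list of substrings that appear in the user agents
// of well-known crawlers (Google, Bing, Yahoo, Yandex, Baidu, DuckDuckGo).
const BOT_TOKENS = ['googlebot', 'bingbot', 'slurp', 'yandex', 'baiduspider', 'duckduckbot'];

// Hypothetical helper: true when the user agent looks like a known crawler.
function isKnownBot(?string $userAgent): bool
{
    // Assumption: a request without a user agent is treated as a bot.
    if ($userAgent === null || $userAgent === '') {
        return true;
    }
    foreach (BOT_TOKENS as $token) {
        // Case-insensitive substring match
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }
    return false;
}
```

It could be called as `isKnownBot($_SERVER['HTTP_USER_AGENT'] ?? null)` before deciding between the neutral language and the "accept-language" detection.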