I have a multilingual website.
The languages are split across subdomains:
fr-fr.mywebsite.com
es-es.mywebsite.com
www.mywebsite.com // root domain => neutral language for bots
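
For reference, the host-to-language mapping is roughly this (a minimal TypeScript sketch; the function name `languageFromHost` is mine, just for illustration):

```typescript
// Minimal sketch: derive the language code from the request host.
// "fr-fr.mywebsite.com" -> "fr-fr"; "www.mywebsite.com" -> null (neutral).
function languageFromHost(host: string): string | null {
  const sub = host.toLowerCase().split(".")[0];
  // Only subdomains that look like a locale ("fr-fr", "es-es") carry a
  // language; "www" and the bare root domain fall through to null.
  return /^[a-z]{2}-[a-z]{2}$/.test(sub) ? sub : null;
}
```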
On the subdomains, if no language cookie is set, I use the subdomain as the language code.
On the primary domain (www), if no language cookie is set, then (see the sketch after this list):
- if it's a bot, I use the neutral language;
- if it's not a bot, I detect the user's language from the "Accept-Language" header.
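
Putting the two rules together, the fallback logic looks roughly like this (a sketch only; `resolveLanguage` and `"neutral"` are illustrative names, and `isBot` is exactly the open question below):

```typescript
// Sketch of the fallback logic above; uses languageFromHost from the
// previous snippet. "neutral" stands for the bot-facing default language.
function resolveLanguage(
  host: string,
  cookieLang: string | null,
  acceptLanguage: string | null,
  isBot: boolean,
): string {
  if (cookieLang) return cookieLang;      // an explicit cookie always wins
  const subLang = languageFromHost(host);
  if (subLang) return subLang;            // language subdomain
  if (isBot || !acceptLanguage) return "neutral"; // www + bot => neutral
  // www + browser: take the first tag (ignoring q-values for brevity),
  // e.g. "fr-FR,fr;q=0.9" -> "fr-fr"
  return acceptLanguage.split(",")[0].split(";")[0].trim().toLowerCase();
}
```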
How can I safely detect whether the client is a robot? I have read older topics on the matter where people simply checked for the absence of the "Accept-Language" header, because bots did not send it; nowadays, however, Google does send this header...
Is it safer to detect that the client is a bot, or the inverse, to detect that it is a web browser? If a bot goes undetected, the website will be indexed in the wrong language.
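
For context, the only "detect if it's a bot" approach I know of is a User-Agent deny-list, roughly as below; the token list is just a sample, and a spoofed User-Agent defeats it entirely:

```typescript
// Naive bot detection via User-Agent tokens (sample list, easily spoofed).
const BOT_UA = /bot|crawl|spider|slurp|baiduspider|yandex|bingpreview/i;

function looksLikeBot(userAgent: string | null): boolean {
  // A missing User-Agent is almost never a real browser.
  if (!userAgent) return true;
  return BOT_UA.test(userAgent);
}
```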
Any ideas?