
The new Google Chrome auto-translation feature is tripping up on one page within one of our applications. Whenever we navigate to this particular page, Chrome tells us the page is in Danish and offers to translate. The page is in English, just like every other page in our app. This particular page is an internal testing page that has a few dozen form fields with English labels. I have no idea why Chrome thinks this page is Danish.

Does anyone have insights into how this language detection feature works and how I can determine what is causing Chrome to think the page is in Danish?

Samuel Neff
  • This is a long shot, but does the page have very few words? Try some other pages that have few words; do they exhibit the same symptom? My guess is that there's a configuration somewhere on the server that sets the locale to Danish, and because there are not enough words on the page to determine the language, Chrome just goes with the server's assumption. – hasen Mar 31 '10 at 06:57
  • See also http://stackoverflow.com/questions/2980520/chrome-translate – dreeves Jun 07 '10 at 00:32
  • Norwegian Bokmål here. I used the word 'Barf' on a few buttons. I changed the word to 'Bounce' and now Chrome thinks it's Dutch. Whaaaaaat? – thomas-peter Sep 07 '12 at 08:57
  • @thomas-peter Dutch guy here. 'Barf' is not even a Dutch word that I ever heard of! Also no idea why Google thinks it's Dutch :p – Stijn de Witt Sep 02 '20 at 14:52

6 Answers


Update: according to Google

We don’t use any code-level language information such as lang attributes.

They recommend that you make your site's language obvious. The following seems to help, even though Content-Language is deprecated and Google says it ignores lang attributes:

<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<meta charset="UTF-8">
<meta name="google" content="notranslate">
<meta http-equiv="Content-Language" content="en">

If that doesn't work, you can always place a bunch of text (your "About" page for instance) in a hidden div. That might help with SEO as well.

EDIT (and more info)

The OP is asking about Chrome, so Google's recommendation is posted above. There are generally three ways to accomplish this for other browsers:

  1. W3C recommendation: Use the lang and/or xml:lang attributes in the html tag:

    <html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
    
  2. meta http-equiv (as described above). UPDATE: this was previously a Google recommendation and is now deprecated in the spec, although it may still help with Chrome:

    <meta http-equiv="Content-Language" content="en">
    
  3. Use HTTP headers (not recommended based on cross-browser recognition tests):

    HTTP/1.1 200 OK
    Date: Wed, 05 Nov 2003 10:46:04 GMT
    Content-Type: text/html; charset=iso-8859-1
    Content-Language: en
    

Exit Chrome completely and restart it to ensure the change is detected. Chrome doesn't always pick up the new meta tag on tab refresh.
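
Before restarting Chrome repeatedly, it can help to confirm which of the declarations listed above a page actually contains. The following is a minimal sketch in Python using only the standard library's html.parser. The class and function names are invented for illustration, and the script only reports what the markup declares; it says nothing about what Chrome will detect.

```python
from html.parser import HTMLParser


class LanguageDeclarationAudit(HTMLParser):
    """Collects the language hints discussed above from an HTML document."""

    def __init__(self):
        super().__init__()
        self.html_lang = None              # <html lang="...">
        self.xml_lang = None               # <html xml:lang="...">
        self.meta_content_language = None  # <meta http-equiv="Content-Language">
        self.notranslate = False           # <meta name="google" content="notranslate">

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "html":
            self.html_lang = attributes.get("lang")
            self.xml_lang = attributes.get("xml:lang")
        elif tag == "meta":
            if attributes.get("http-equiv", "").lower() == "content-language":
                self.meta_content_language = attributes.get("content")
            if (attributes.get("name", "").lower() == "google"
                    and attributes.get("content", "").lower() == "notranslate"):
                self.notranslate = True


def audit(html_text):
    """Return a dict of the language declarations found in html_text."""
    parser = LanguageDeclarationAudit()
    parser.feed(html_text)
    return {
        "html_lang": parser.html_lang,
        "xml_lang": parser.xml_lang,
        "meta_content_language": parser.meta_content_language,
        "notranslate": parser.notranslate,
    }


if __name__ == "__main__":
    page = (
        '<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">'
        '<head><meta charset="UTF-8">'
        '<meta name="google" content="notranslate">'
        '<meta http-equiv="Content-Language" content="en"></head></html>'
    )
    print(audit(page))
```

The Content-Language HTTP header is not covered here because it is not part of the markup; it would have to be checked on the server response separately.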

Kyle Cureau
  • Here's a description of Google's meta tags: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=79812 – Joshua Davis Jun 29 '12 at 20:30
  • @RickM, no? This is from Google: "If you're a webmaster and would prefer your web page not be translated by Google Translate, just insert the following meta tag into your HTML file: " See: http://support.google.com/translate/ – Kyle Cureau Jul 04 '12 at 19:07
  • Nope, definitely not working on either the Windows or Mac version of Chrome. It seems that a number of people can't get it to work... seems a bit hit and miss! – Sk446 Jul 10 '12 at 08:06
  • @Emile: It works if you load the page in a new tab. It doesn't work if you just press F5 to refresh. – Stefan Steiger Aug 23 '12 at 08:32
  • In HTML5 it should be content instead of value: – roeland Nov 28 '12 at 08:35
  • Setting the correct response headers would be preferable over http-equiv meta tags. – Ja͢ck Jan 26 '13 at 21:26
  • @Jack, that's neither the recommendation of Google nor the W3C. Although your challenge did turn up interesting info which called my answer into question: http://www.w3.org/International/tests/html-css/language-declarations/results-language-declarations – Kyle Cureau Jan 27 '13 at 21:51
  • Interesting results, especially since http-equiv is supposed to mean that it works the same as an HTTP response header. – Ja͢ck Jan 27 '13 at 22:02
  • Chrome seems to do whatever it wants. I can return txt files in English, specifying in the HTTP response headers that they are ASCII, and even if the data only contains ASCII characters, Chrome still does a frequency analysis on the bytes and prompts the user that it is in a different language. – Myforwik Oct 07 '14 at 02:33
  • I'm having the same issue: even though `` is set, Chrome thinks it's English. – Andy Jun 13 '16 at 18:52
  • Doesn't work on Chrome or Search, but I imagine the difficulty is that you are using meta tags supported by the latter to tell the former what to do. Interestingly, [these folks are having the opposite problem](https://bugs.chromium.org/p/chromium/issues/detail?id=586053). – Kevin Jun 29 '16 at 14:32
  • None of the above worked for me. I tried a lot of tricks, but Chrome is really stubborn in trusting itself to detect the language correctly, no matter how many language attributes and headers you set. What helped was adding about 10 extra words in English, after which it stopped asking to translate my page from Portuguese to English. – Jørgen Feb 21 '17 at 09:59
  • Note that you have to hard reload (or close and reopen Chrome) before the translate message will disappear; a normal reload is not enough. – Carter Medlin Aug 29 '18 at 23:08
  • Just confirming, I had to close Chrome entirely for this change to be picked up correctly. A new tab or page reload isn't enough. – Matt Sanders Apr 11 '19 at 19:03
  • @Ja͢ck actually I'm pretty certain the standards bodies agree that HTTP headers are preferred over HTML meta tags *in general*. Not sure about the Translation toolbar though. But for certain things it's clear why the HTTP response headers are generally preferred. These work for *all* HTTP resources, not just HTML, and they are not dependent on the response body. Think about the contradiction of reading content type and charset info *from the content*... you need to know the charset to parse the content to find the charset, which is why the meta tag for Content-Type must be near the top of the page. – Stijn de Witt Sep 02 '20 at 14:59
  • Jack, it was intended for @Kyle but I had two at-mentions and only one was allowed and I removed the wrong one. Sorry. – Stijn de Witt Sep 02 '20 at 15:00
  • I had to put `lang="en-US"`; `lang="en"` didn't stop it. I realized it was offering Latin because I had a bunch of Lipsum in an example page. – John Baber-Lucero Apr 21 '21 at 20:19
  • Updates to hack around Chrome are normally outdated as fast as they are posted. Is there anything left in this post that still works? – run_the_race Aug 13 '21 at 11:10

I added lang="en" to the html tag, added meta tags for charset utf-8 and Content-Language in the HTML header, and specified charset as utf-8 and Content-Language as en in the HTTP response headers. None of it stopped Chrome from declaring that my page was in Portuguese. The only thing that fixed the problem was adding this to the HTML header:

<meta name="google" content="notranslate">

But now I've prevented users from translating my page that is clearly in English to their own language. Poor job, Chrome. You can be better than this.

Chris Broski
  • So true! They say *'We don’t use any code-level language information such as lang attributes'*. Yeah, because that would be weird. Instead, we use some secret/proprietary magic algorithm. When IE did this for determining Content-Type, we said they did not follow standards, but when we do it, suddenly it's great. Yay! – Stijn de Witt Sep 02 '20 at 15:04

Specify the default language for the document, then use the translate attribute and Google's notranslate class per element/container, as in:

<html lang="en">
    ...
    <span><a href="#" translate="no" class="notranslate">English</a></span>

Explanation:

The accepted answer presents a blanket solution, but does not address how to specify the language per element, which can fix the bug and ensure your page remains translatable.

Why is this better? This cooperates with Google's internationalization rather than shutting it off. Referring back to the OP:

Why does Chrome incorrectly determine page is in a different language and offer to translate?

Answer: Google is trying to help you with internationalization, but we need to understand why this is failing. Building on NinjaCat's answer, we assume that Google predicts the language of your page using an N-gram algorithm, so we can't say exactly why Google wants to translate your page; we can only assume that:

  1. There are words on your page that belong to a different language.
  2. Marking the containing element as translate="no" and lang="en" (or removing these words) will help Google to correctly predict the language of your page.

Unfortunately, most people reaching this post won't know which words are causing the trouble. Use Chrome's built-in "Translate to English" feature (in the right-click context menu) to see what gets translated. You may see unexpected translations like the following:

[Image: screenshot showing unexpected translations on the page]

So, update your HTML with the appropriate translation attributes until the Google translation of your page changes nothing; the popup should then go away for future visitors.

Won't it be a lot of work to add all these extra tags? Yes, very likely. If you are using WordPress or another content management system, look in its documentation for quick ways to update your code!

Design.Garden

Without knowing what the text is, I can only guess: perhaps the n-gram detection is being tricked by the content of your page.

http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html

https://en.wikipedia.org/wiki/N-gram
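
To make the n-gram idea concrete, here is a toy sketch in Python of character-trigram language guessing. It is not Chrome's detector: the sample sentences, the function names, and the scoring rule (the fraction of the text's trigrams that also occur in each language profile) are all assumptions chosen for illustration.

```python
from collections import Counter

# Tiny, made-up training samples; a real detector would use far larger corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog",
    "da": "den hurtige brune ræv hopper over den dovne hund",
}


def trigrams(text):
    """Count character trigrams, padding with spaces so word edges count too."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def build_profiles(samples):
    """Build one trigram profile per language from the sample texts."""
    return {lang: trigrams(text) for lang, text in samples.items()}


def similarity(sample, profile):
    """Fraction of the sample's trigrams that also appear in the profile."""
    total = sum(sample.values())
    if total == 0:
        return 0.0
    shared = sum((sample & profile).values())  # multiset intersection
    return shared / total


def guess_language(text, profiles):
    """Return the best-scoring language and the score for every language."""
    sample = trigrams(text)
    scores = {lang: similarity(sample, prof) for lang, prof in profiles.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    profiles = build_profiles(SAMPLES)
    print(guess_language("the lazy dog", profiles))
```

With very little input text there are only a handful of trigrams to compare, so a few unusual words can swing the result. That would be consistent with the reports in this thread of sparse pages and single button labels being misdetected, though it is only a plausible explanation, not a confirmed one.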

Krenair
NinjaCat
  • But the question is, how can I debug it or get more info from Chrome to figure out exactly why it made the choice it did? – Samuel Neff Jul 24 '10 at 04:04
  • Without seeing the text, I cannot say for sure. Some things to try: if you copy the text and paste it into translate.google.com, and set it to "Detect Language", does it tell you that it's English or not? If it says it's Danish or whatever, then I would start removing sentences until you find the troublemaker. – NinjaCat Jul 24 '10 at 06:54
  • Hi Sam - That's in effect what I am suggesting. There's no way to ask it why it made the decision. There's some sentence or wording in your text that is tricking it (after all, machine translation is far from perfect). In order to debug this, I would take out sentence after sentence until it recognizes the correct language. – NinjaCat Jul 25 '10 at 08:31
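
The sentence-by-sentence removal suggested in these comments can be sketched in Python as follows. The detector is passed in as a function because Chrome exposes no way to ask why it chose a language; `fake_detect` below is a hypothetical stand-in used only so the sketch runs, and in practice you would substitute a manual check or whatever detector you have access to.

```python
from typing import Callable, List


def find_troublemakers(
    sentences: List[str],
    detect: Callable[[str], str],
    expected: str = "en",
) -> List[str]:
    """Return the sentences whose removal alone makes `detect` report `expected`."""
    if detect(" ".join(sentences)) == expected:
        return []  # the full text is already detected correctly

    culprits = []
    for index, sentence in enumerate(sentences):
        # Rebuild the text with exactly one sentence left out.
        remaining = sentences[:index] + sentences[index + 1:]
        if detect(" ".join(remaining)) == expected:
            culprits.append(sentence)
    return culprits


def fake_detect(text: str) -> str:
    """Hypothetical detector for demonstration: one made-up trigger word."""
    return "da" if "barf" in text.lower() else "en"


if __name__ == "__main__":
    page_sentences = [
        "Save the form.",
        "Barf the record.",
        "Cancel and go back.",
    ]
    print(find_troublemakers(page_sentences, fake_detect))
```

Removing one sentence at a time only finds a culprit that misleads the detector on its own. If several sentences each trigger the wrong language, removing any single one will not change the result, and you would need to remove them in groups instead.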

Chromium thinks this page is in Filipino: http://www.reyalvarado.com/portfolio/cuba/ Notes: There is pretty much no text on the page except for the owner's name and the menu items. Menu items are dynamically replaced with images by FLIR.

The HTML declares the page as US English:

<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US"> 

If the other solutions don't work, try adding the xml:lang attribute to the <html> tag:

<html class="no-js" lang="pt-BR" dir="ltr" xml:lang="pt-BR">
Peter O.
Alan