2

We have an intranet application that is predominantly English but serves fragments of content in various unspecified languages.

Chrome used to detect foreign content and offer to translate it for us: if the page had a lot of foreign content, it offered to translate the whole page; if it had only a little, the user could select the text and use the 'Translate to English' context-menu item. Both options are currently broken because Chrome (apparently) now treats the Content-Language header as gospel and disables the translation context menu.

Our server puts out response headers including:

Content-Language: en-US

This is correct, as the UI is all English. However, we need some way to mark areas of the page where the content may not be English, e.g. comment areas.

What lang value can we use to mark up a section of a page as 'may not be English', without knowing exactly what language it is?

<div lang="??">
  <p>...customer comments...</p>
</div>
searlea

2 Answers

1

By the HTML5 CR, the way to declare that some content is in an unknown language is to use the lang attribute with an empty string as its value, i.e. lang="".
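As a minimal sketch of what this could look like on a page whose UI is English: the empty lang value on the inner element overrides the "en" inherited from the root. The page title and heading here are placeholders, not taken from the question.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Customer feedback</title>
  </head>
  <body>
    <!-- Inherits lang="en" from the root element -->
    <h1>Customer feedback</h1>

    <!-- lang="" overrides the inherited "en" and declares the language unknown -->
    <div lang="">
      <p>...customer comments...</p>
    </div>
  </body>
</html>
```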

Whether Chrome recognizes this is a different issue. In general, the quality of machine translation is so low that disabling it is usually a good thing.

Jukka K. Korpela
  • Unfortunately Chrome doesn't currently support this, but it's the right answer. It looks like we'll have to wait for a few more of Chrome's [content-language decision bugs](https://code.google.com/p/chromium/issues/detail?id=173999) to be resolved. – searlea Feb 22 '13 at 07:55
0

If I recall correctly (from the Multilingual Web Workshop), the Content-Language header should no longer be used. Instead, you should use the lang and xml:lang attributes.

With this in mind, the answer is pretty straightforward: use the lang attribute for things you know for sure will be in English, and leave everything else untouched.
This of course means that you cannot use the lang attribute at global scope (i.e. as <html lang="en">), but I really can't help it.
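A rough sketch of that layout, with placeholder element names and class names: only the parts known to be English carry lang="en", and the root element has none. One caveat, as far as I understand the HTML specification: an element with no lang on itself or any ancestor falls back to the page's default language, which can come from the Content-Language HTTP header, so this approach presumably only helps if that header is dropped as well.

```html
<!DOCTYPE html>
<html> <!-- deliberately no lang attribute at global scope -->
  <head>
    <meta charset="utf-8">
    <title lang="en">Customer feedback</title>
  </head>
  <body>
    <!-- UI chrome that is known to be English -->
    <nav lang="en">...menu...</nav>
    <h1 lang="en">Customer feedback</h1>

    <!-- Content of unknown language is left untouched -->
    <div class="comments">
      <p>...customer comments...</p>
    </div>
  </body>
</html>
```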

BTW, it is sort of possible (although not easy) to mark dynamically generated portions with the appropriate lang. For example, you could force the user to select a valid language from a list, or use a library to detect the language from the text. In either case you would know the language at rendering time, which I believe is sufficient.
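For illustration, the rendered output might then look something like the sketch below. The language tags are assumed to come from the user's selection or from a detection step; the specific languages and the fallback to an empty lang for undetermined text are my own choices for the example.

```html
<div class="comments">
  <!-- Language supplied by the user's selection or a detection library -->
  <p lang="pt">...comment identified as Portuguese...</p>
  <p lang="es">...comment identified as Spanish...</p>

  <!-- No selection, or detection was inconclusive: declare the language unknown -->
  <p lang="">...comment in an undetermined language...</p>
</div>
```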

Paweł Dyda
  • A lot of our content comes from data files with no markup to indicate language. We sometimes know the country, but that's not enough - e.g. comments from Brazil can come in Spanish or Portuguese. From Germany we (strangely) get mostly English comments with only the occasional German. – searlea Feb 19 '13 at 20:42
  • @searlea: That's why I mentioned a library to detect the language. For example, one may use [ICU's CharsetDetector](http://icu-project.org/) for that purpose. It might sound strange at first, but it really detects a language-encoding pair. Several other libraries exist as well. – Paweł Dyda Feb 19 '13 at 20:54