
Take a sentence like this:

"I visited this place {0} {1} ago."

Here {0} is an integer and {1} is the word "year" or "years", depending on that integer. In Russian, however, the form of "year" is not simply singular or plural; it depends on the exact number (год, года, лет). So any rule that only distinguishes "year" from "years" is insufficient for Russian.

What I need to know is this: is there a way to add such rules to the resource bundle or to the source code while keeping the entire string intact, or do I have to split the string into

"I visited this place {0} " + "{1} " + "ago."

... and implement the rule in the source code? How do you handle problems like this? Is there a best practice?

James Donnelly
  • Your title's misspelled and you've given no indication of what platform and language you're using, therefore "resource bundle" could mean anything. – KomodoDave Feb 11 '13 at 12:37

2 Answers


The golden rule of i18n

Don't produce localized output by concatenating localized strings. Ever. For any reason.


Here you are violating this rule by inserting the localized form of "year/years" midway through a larger piece of text. For your specific example this is easy to work around (just localize "N year(s)" as a whole and insert that instead), but that would not really solve the problem. There are languages whose structures depend even more on context, and there this approach would eventually break down completely.

For best results you should localize the string as a whole. For the Russian locale the string should have 3 different forms, selected by the value of the "years" parameter (I don't know Russian, so I have no idea which form is used for which values).

I'm not sure which i18n technologies you are using, but gettext (which the question is tagged with) supports this out of the box.
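To make the gettext idea concrete, here is a minimal sketch in Python that mimics how a catalog with three plural forms is consulted. It does not call gettext itself: the function names `plural_index_ru` and `visited_message_ru` are made up for illustration, the index function is a hand translation of the plural expression commonly used for Russian in a catalog's Plural-Forms header, and the Russian sentences are my own illustrative translations, not vetted ones. It assumes non-negative integers.

```python
# Whole-sentence translations, one per plural form, in the order a
# gettext catalog would list them as msgstr[0], msgstr[1], msgstr[2].
# The Russian text is an illustrative translation (assumption).
MSGSTR_RU = [
    "Я посетил это место {0} год назад.",   # 1, 21, 31, ...
    "Я посетил это место {0} года назад.",  # 2-4, 22-24, ...
    "Я посетил это место {0} лет назад.",   # 0, 5-20, 25-30, ...
]


def plural_index_ru(n: int) -> int:
    """Return the plural form index (0, 1 or 2) for a non-negative integer."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2


def visited_message_ru(years: int) -> str:
    """Pick the full sentence for this count, then insert the number."""
    return MSGSTR_RU[plural_index_ru(years)].format(years)


if __name__ == "__main__":
    for years in (1, 2, 5, 11, 21, 112):
        print(visited_message_ru(years))
```

With real gettext you would not write the index function yourself: a call such as `ngettext(msgid, msgid_plural, n)` evaluates the catalog's Plural-Forms expression and picks the matching msgstr entry. Either way, the translator supplies complete sentences and nothing is concatenated.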

Jon

To some extent you have already answered your own question: you should not concatenate strings. Placeholders are fine for numbers, dates and dynamic text.
I would argue that the unit of measurement (time, in this case) is not dynamic text.

How can you resolve this problem?

I'll give you some basic blueprints of two ideas. Both require using full sentences.

  1. You can rearrange the sentence so that the plural problem does not arise, e.g. "The place has been visited this number of years|months|days|hours|minutes ago: {0}".
    This has obvious drawbacks and does not sound natural. And although I can't give you an example of a language where this approach won't work, there is a non-zero probability that such a language exists (the Slavic languages are not among them, that much is certain).

  2. Use a rules-based selection method to pick the valid plural form from the resource files. To do that you need to know a bit about Language Plural Rules. You can either implement the CLDR rules yourself or rely on something existing, such as wrapping ICU4C's PluralRules class and using its select method to, well, select the valid plural form.
    The ICU Project site even lists existing wrappers that you can use in your C# application, namely GenICUWrapper and ICU-Dotnet.

Personally, I'd recommend the latter method (with an ICU wrapper). You may want to see my answer to a similar problem, which has a solution in Java. I believe a .NET version would be based on the same idea, only you would use string.Format() instead of MessageFormat and read the resources the .NET way (whichever style you prefer).
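As a rough illustration of the rules-based idea, here is a sketch in Python that implements the selection by hand instead of calling ICU. The category names (one, few, many, other) follow CLDR, and the English and Russian integer rules are written out from the CLDR plural rules as I understand them. Everything else is hypothetical: the `PLURAL_RULES` and `RESOURCES` tables, the `visited.<category>` key scheme, the `format_visited` helper, and the Russian translations. It assumes non-negative integers.

```python
from typing import Callable, Dict


def category_en(n: int) -> str:
    """CLDR-style plural category for English integers."""
    return "one" if n == 1 else "other"


def category_ru(n: int) -> str:
    """CLDR-style plural category for Russian integers."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"


# Hypothetical registry standing in for a library's select method.
PLURAL_RULES: Dict[str, Callable[[int], str]] = {
    "en": category_en,
    "ru": category_ru,
}

# Hypothetical resource files: one full sentence per plural category.
RESOURCES: Dict[str, Dict[str, str]] = {
    "en": {
        "visited.one": "I visited this place {0} year ago.",
        "visited.other": "I visited this place {0} years ago.",
    },
    "ru": {
        "visited.one": "Я посетил это место {0} год назад.",
        "visited.few": "Я посетил это место {0} года назад.",
        "visited.many": "Я посетил это место {0} лет назад.",
    },
}


def format_visited(locale: str, years: int) -> str:
    """Select the sentence for the plural category, then format the number."""
    category = PLURAL_RULES[locale](years)
    template = RESOURCES[locale]["visited." + category]
    return template.format(years)


if __name__ == "__main__":
    for locale in ("en", "ru"):
        for years in (1, 3, 11, 22):
            print(locale, format_visited(locale, years))
```

The source code only ever asks for a category and formats one complete sentence, so supporting another language means adding its rule and its sentences without touching the calling code. With an ICU wrapper, the two hand-written category functions would be replaced by the library's PluralRules.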

decocijo
Paweł Dyda
  • One more thing (does not really belong to my answer). As @Jon mentioned, [gettext](http://code.google.com/p/gettext-cs-utils/) supports plurality out of the box, but personally I don't think that you meant to use the gettext tag in that context. – Paweł Dyda Feb 11 '13 at 17:08
  • Dear Pawel, really impressed, thank you! Another interesting thing to know: is there any case rule as to when you use the ICU library and when gettext? Is ICU more closely directed towards object-oriented languages and gettext to C? I, by the way, apologize for being vague / inaccurate - I'm a translator, not a software engineer. Alexander – user2061221 Feb 13 '13 at 15:02
  • @user2061221: It just depends on your planned support for i18n. ICU is more generic, that is, it offers much more than just handling localizable strings and message formatting. However, if you don't need the additional features, gettext might be a more natural choice. Although, to be honest, I would not use it with .NET, which has its own mechanism for string externalization (and limited support for message formatting). – Paweł Dyda Feb 13 '13 at 17:07
  • Another question: @Paweł Dyda demonstrated how you can control dynamic dependencies through the source code using the select method: [http://stackoverflow.com/questions/14326653/java-internationalization-i18n-with-proper-plurals/14327683#14327683] This is Java code. Does anybody know to what extent this works with other languages too, especially C, C++, C#, Delphi, Objective-C, Pascal, Perl, PHP, Python, Ruby? ICU and GNU gettext do have wrappers for most of them, but do these languages offer the required functionality? Alexander – user2061221 Mar 07 '13 at 15:20