
When I instantiate a Locale object with any of the language codes he, yi, or id, the code is not preserved.

For example:

Locale locale = new Locale("he", "il");
locale.getLanguage(); // -> "iw"

What is causing this and is there any way to work around this?

Alex Ciminian

2 Answers


The Locale class does not validate what you pass to it, but it does swap certain language codes for their old values. From the documentation:

ISO 639 is not a stable standard; some of the language codes it defines (specifically "iw", "ji", and "in") have changed. This constructor accepts both the old codes ("iw", "ji", and "in") and the new codes ("he", "yi", and "id"), but all other API on Locale will return only the OLD codes.

Here's the constructor:

public Locale(String language, String country, String variant) {
    this.language = convertOldISOCodes(language);
    this.country = toUpperCase(country).intern();
    this.variant = variant.intern();
}

And here's the magic method:

private String convertOldISOCodes(String language) { 
    // we accept both the old and the new ISO codes for the languages whose ISO 
    // codes have changed, but we always store the OLD code, for backward compatibility 
    language = toLowerCase(language).intern(); 
    if (language == "he") { 
        return "iw"; 
    } else if (language == "yi") { 
        return "ji"; 
    } else if (language == "id") { 
        return "in"; 
    } else { 
        return language; 
    }
}

The objects it creates are immutable, so the stored code cannot be changed after construction. The class is also final, so you can't extend it, and it has no specific interface you could implement instead. One way to preserve those language codes is to create a wrapper around this class and use that.
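As a rough illustration of the wrapper idea, here is a minimal sketch of a helper that maps the three legacy codes forward when reading the language from a Locale. The class name LanguageCodes and the method modernLanguage are hypothetical names chosen for this example; they are not part of the JDK.

```java
import java.util.Locale;

// Hypothetical helper (not a JDK class): reads the language from a Locale
// and maps the three legacy ISO 639 codes to their current equivalents.
final class LanguageCodes {
    private LanguageCodes() {}

    static String modernLanguage(Locale locale) {
        String language = locale.getLanguage();
        switch (language) {
            case "iw": return "he"; // Hebrew
            case "ji": return "yi"; // Yiddish
            case "in": return "id"; // Indonesian
            default:   return language; // every other code is passed through unchanged
        }
    }
}
```

With this helper, LanguageCodes.modernLanguage(new Locale("he", "IL")) should yield "he" whether the runtime stores the old or the new code internally. If a BCP 47 tag is acceptable for your use case, Locale.toLanguageTag() (available since Java 7) is another option, since it is documented to emit the new codes.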

Alex Ciminian
  • The funny part is that `he`, `yi` and `id` are standard codes (according to Wikipedia - http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes), unlike their replacements. – Adrian Ber Feb 20 '13 at 16:29
  • ICU's ULocale appears to handle "he" correctly at least. (By "correctly", I mean "leaving it alone".) So depending on what you're using it for, that can be a workaround. – Hakanai Jul 31 '15 at 05:20

Java's treatment of the Hebrew locale seems to have changed in Java 17. The change appears to be an attempt to adhere to the ISO 639-1 language codes.

Unless the system property 'java.locale.useOldISOCodes' is set to true, Java now treats the Hebrew locale as 'he' by default, in line with ISO 639-1. This means a Hebrew resource bundle named 'messages_he.properties' loads successfully for locales constructed with either the 'iw' or the 'he' language code. A 'messages_iw.properties' resource is deprioritized and is only loaded if no corresponding 'he' resource exists.
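To see which code your own runtime stores, a small check along these lines may help. It is only a sketch: the class name HebrewLocaleCheck and its methods are made up for this example, and the printed code depends on the Java version and on whether the property was passed at startup (for example with -Djava.locale.useOldISOCodes=true).

```java
import java.util.Locale;

// Hypothetical diagnostic class (name chosen for this example only).
final class HebrewLocaleCheck {
    private HebrewLocaleCheck() {}

    // Old and new codes are expected to resolve to the same Locale,
    // whichever of the two the runtime stores internally.
    static boolean oldAndNewCodesAreEqual() {
        return Locale.forLanguageTag("iw-IL").equals(Locale.forLanguageTag("he-IL"));
    }

    // The code that getLanguage() reports for Hebrew on this runtime.
    static String storedHebrewCode() {
        return Locale.forLanguageTag("he-IL").getLanguage();
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.locale.useOldISOCodes = "
                + System.getProperty("java.locale.useOldISOCodes"));
        System.out.println("stored Hebrew code = " + storedHebrewCode());
        System.out.println("iw equals he = " + oldAndNewCodesAreEqual());
    }
}
```

The equality check is the part that resource bundle lookup relies on: because both spellings map to one Locale, a single 'he' bundle can serve either.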

It's a step in the right direction, and better late than never: no more trickery or magic is required when naming Hebrew resource bundles. Just use the 'he' ISO code.

I recently answered this in "Locale code for Hebrew / Reference to other locale codes?", where I provided a small example class with basic resource bundles that demonstrates the new behavior.

Tom Silverman