93

Suppose you want to store a user's preferred locale in a database. Which value would you use?

en_US or en-US

These are two different standards, but which one do you prefer to use in your own application?

Update: It seems many websites use a dash instead of an underscore, e.g.

http://zh.wikipedia.org/zh-tw http://www.google.com.hk/search?hl=zh-TW

dkarp
Howard

5 Answers

110

I'm pretty sure "-" is the standard. If you see "_" somewhere it's probably something some people came up with to make it a valid identifier.

Personally I'd go with "-", just to be correct.

http://en.wikipedia.org/wiki/IETF_language_tag

https://datatracker.ietf.org/doc/html/rfc5646
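To illustrate the point about "-" being the standard separator, here is a minimal sketch that checks a tag with `Locale.Builder.setLanguageTag`, which rejects ill-formed BCP 47 tags by throwing `IllformedLocaleException`. The class name `LanguageTagCheck` and the helper `isWellFormed` are made up for this example; it assumes Java 7 or later.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

final class LanguageTagCheck {
    // Hypothetical helper: true if Locale.Builder accepts the tag as well-formed BCP 47.
    static boolean isWellFormed(String tag) {
        try {
            new Locale.Builder().setLanguageTag(tag);
            return true;
        } catch (IllformedLocaleException e) {
            // Thrown for tags that do not follow the BCP 47 syntax.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("en-US")); // true
        System.out.println(isWellFormed("en_US")); // false
    }
}
```

A check like this could run before a value is written to the database, so that only hyphenated tags end up stored.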

Nicolas Bouvrette
Matti Virkkunen
  • 3
    Yes, "-" is the standard. Even though I use Java, I will follow the standard. (http://www.w3.org/TR/html401/struct/dirlang.html) – Howard Feb 05 '11 at 09:40
  • 8
    @Howard Um, you're going to need the locale names in `en_US` format every time you instantiate a `Locale` object. (For currency formatting, date formatting, etc.) Storing the data in the `en-US` format and replacing dashes with underscores each time you need to use the stored data will absolutely *work*, but it may be wiser (and certainly simpler) to store the locale names in the format your application actually uses... – dkarp Feb 05 '11 at 12:09
  • 16
    @dkarp Using [Locale#toLanguageTag()](http://docs.oracle.com/javase/7/docs/api/java/util/Locale.html#toLanguageTag()) and [Locale#forLanguageTag()](http://docs.oracle.com/javase/7/docs/api/java/util/Locale.html#forLanguageTag(java.lang.String)) will do the trick (in JDK 1.7, though). – viphe Aug 30 '13 at 18:38
  • FYI, in my Spring-based web application the developers use en_US, as dkarp mentioned in his post. – Lucky Dec 31 '13 at 07:50
  • In Python I use `{% set currency = "USD" %}{% set format = "en_US" %}` and I'm trying to find out whether it is correct. – Niklas Rosencrantz Jul 01 '15 at 18:28
  • @Programmer 400 That looks like funny Python to me; more like Jinja2 than Python. Whether it's correct depends on what you're using. – dalore May 24 '16 at 15:23
34

If you're working with Java, you might as well use the Java locale format (en_US).

The BCP 47 documents actually do specify the en-US format, and it's just as common if not more common than Java-style locale names. But in practice you'll see the form with the underbar quite a bit. For example, both Java and most POSIX-type platforms use the underbar for their language/region separator.

So you can't go far wrong with either choice. But given that you're writing in Java and probably targeting a Unix platform, en_US is probably the way to go.
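If you do store the Java-style name, a small conversion helper keeps the rest of the code from caring which separator was used. The following is only a sketch under some assumptions: `StoredLocaleParser` and `parse` are hypothetical names, Java 7+ is assumed for `Locale.forLanguageTag`, and only simple language/region values such as `en_US` are handled (Java-specific variants are not).

```java
import java.util.Locale;

final class StoredLocaleParser {
    // Hypothetical helper: accepts "en_US" or "en-US" and returns the matching Locale.
    // Underscores are normalized to hyphens before parsing as a BCP 47 tag.
    static Locale parse(String stored) {
        return Locale.forLanguageTag(stored.replace('_', '-'));
    }

    public static void main(String[] args) {
        // Locale.toString() produces the underscore form discussed above.
        System.out.println(Locale.US);                        // en_US
        System.out.println(parse("en_US").equals(Locale.US)); // true
        System.out.println(parse("en-US").equals(Locale.US)); // true
    }
}
```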

n8felton
dkarp
  • 1
    Yeah, but he's got a `java` tag on his question. Check out the links in my answer, if you'd like... – dkarp Feb 05 '11 at 02:57
  • Hum, well, you're right about *nix platforms at least. Forgot those. – Matti Virkkunen Feb 05 '11 at 02:59
  • @Matti And you're completely right about the BCP 47 docs (which I noted), but for the questioner's needs the Java-style locale format is probably more appropriate. – dkarp Feb 05 '11 at 03:00
  • As shown, there are many variants but the official standard is underscores. When making calls to frameworks with different conventions, you have a "pratfall" of the coder needing to know a conversion is required. I recommend using the convention of the framework you are calling from. When you are making calls to another framework, provide "proxies" that do the conversion. Why? It eliminates the need to know that the called framework uses a different convention. Contributors will ONLY "see" one convention, and using that one will avoid the pratfall. – DaBlick Jan 05 '18 at 15:00
  • 1
    Java doesn't support parsing that format, just dumping it out, and the Javadoc describes it as being for debug purposes, not for standards-compliant storage. – Charlie Aug 13 '21 at 18:32
  • You can actually go wrong, and it does go wrong with newer Java versions. The underscores are no longer supported. – SeverityOne Aug 05 '22 at 12:38
  • For message bundle file names, should newer Java versions use the underscore or the dash format? How about Java 11? – eastwater Jan 09 '23 at 09:13
16

In Java 7, there is a new method Locale.forLanguageTag(String), which assumes the hyphen as a separator. I'd consider that as normative.

Check the documentation of Locale for more information.
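As a rough illustration of the hyphenated round trip (assuming Java 7 or later), the sketch below serializes a `Locale` with `toLanguageTag()` and restores it with `forLanguageTag(String)`. The class and method names are invented for the example and stand in for whatever persistence code you actually have.

```java
import java.util.Locale;

final class LanguageTagRoundTrip {
    // Hypothetical helper: the value that would be written to the database.
    static String toStoredValue(Locale locale) {
        return locale.toLanguageTag();
    }

    // Hypothetical helper: rebuilds the Locale from the stored hyphenated tag.
    static Locale fromStoredValue(String tag) {
        return Locale.forLanguageTag(tag);
    }

    public static void main(String[] args) {
        String stored = toStoredValue(Locale.US);
        System.out.println(stored);                     // en-US
        Locale restored = fromStoredValue(stored);
        System.out.println(restored.getLanguage());     // en
        System.out.println(restored.getCountry());      // US
        System.out.println(restored.equals(Locale.US)); // true
    }
}
```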

rolve
user1774051
3

en_US. This is a very useful read.

CoolBeans
  • Sadly, Oracle has deleted or moved that article. – Luke Bayes Sep 20 '12 at 18:55
  • 3
    Yes sorry, downvoting as the article is gone. – ebruchez Sep 20 '13 at 00:47
  • 5
    @ebruchez Instead of downvoting, you can just leave a comment and I can update the link or remove the answer. This is a two-and-a-half-year-old answer. Thanks for catching it though. – CoolBeans Sep 20 '13 at 01:37
  • 9
    With a broken link it deserved a downvote, wouldn't you say? There was no guarantee you would come back to edit it. Re-upvoted now. – ebruchez Sep 20 '13 at 19:04
  • @ebruchez Thanks. I guess my point was that when you have hundreds of posts, it's hard to go back and check which links are broken. So leaving a comment, or better yet editing and updating the post, is what I would recommend. That's just my thought process though; not everyone thinks the same way :). – CoolBeans Sep 20 '13 at 19:49
  • 6
    To clarify: the downvote was not a criticism of your initial comment or your abilities, but about the value of the answer at that particular point. And if not downvoting right away, I would have had to 1. add a comment 2. add myself a reminder to downvote the answer later if not addressed. And there was already 1 comment pointing out that the link was incorrect. – ebruchez Sep 23 '13 at 17:30
  • 4
    https://web.archive.org/web/20150604051040/http://docs.oracle.com/javase/7/docs/api/java/util/Locale.html – Benjamin Aug 13 '15 at 11:51
  • The Java implementation is more lenient than BCP 47 and accepts both separators: "Well-formed variant values have the form SUBTAG (('_'|'-') SUBTAG)* where SUBTAG = [0-9][0-9a-zA-Z]{3} | [0-9a-zA-Z]{5,8}. (Note: BCP 47 only uses hyphen ('-') as a delimiter, this is more lenient).". So use a dash if you are following the standard. – saganas Jun 03 '19 at 09:12
  • Whether the link is broken or not, it's not a great answer anyway. The linked JavaDoc contains a lot of info, which parts are relevant to the answer? The article mentions both formats en_US and en-US, but the answer provides no explanation as to *why* one should use en_US over en-US. – Dario Seidl Oct 25 '21 at 16:14
  • It's also an incorrect answer, at least in this day and age. That's why I'm downvoting, because you'll run into problems if you use `en_US` instead of `en-US`. Just today I had to deal with an issue with Nimbus (used by Spring Security 5 for JWT) where it complained about an OpenID Connect configuration with `en_GB` instead of `en-GB`. – SeverityOne Aug 05 '22 at 12:36
0

I don't think en-US is a standard at all for Java. (If you see it somewhere, could you add a link?)

So just use en_US.

jzd
  • As shown, there are many variants but the official standard is underscores. When making calls to frameworks with different conventions, you have a "pratfall" of the coder needing to know a conversion is required. I recommend using the convention of the framework you are calling from. When you are making calls to another framework, provide "proxies" that do the conversion. Why? It eliminates the need to know that the called framework uses a different convention. Contributors will ONLY "see" one convention, and using that one will avoid the pratfall. – DaBlick Jan 05 '18 at 14:56
  • 2
    `en-US` is the actual standard, and `en_US` is incorrect. I dealt with this exact issue today, where Nimbus threw an exception because it found `en_GB` in an OIDC configuration instead of `en-GB`. – SeverityOne Aug 05 '22 at 12:38