Due to API version changes, we are now receiving ISO 639-1 codes (2 letters, e.g. "en", "fr", "es") instead of ISO 639-2 codes (3 letters, e.g. "eng", "frm", "spa"). I found that in Java we can build a Locale and call its getISO3Language() method to get the 3-letter code from the 2-letter code.
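A minimal sketch of that lookup, assuming the incoming values are well-formed ISO 639-1 codes. The class name `Iso3Lookup` and the method `toIso3` are hypothetical names used for illustration; `Locale.forLanguageTag` and `Locale.getISO3Language` are the standard `java.util.Locale` methods. Java returns the terminology (T) form for 2-letter codes, so "fr" yields "fra", not "fre".

```java
import java.util.Locale;
import java.util.MissingResourceException;

class Iso3Lookup {
    /**
     * Returns the 3-letter code Java associates with a 2-letter language code,
     * or null when the input is not a usable language tag or has no mapping.
     */
    static String toIso3(String iso2) {
        try {
            // An ill-formed tag yields a locale with an empty language,
            // for which getISO3Language() returns the empty string.
            String iso3 = Locale.forLanguageTag(iso2).getISO3Language();
            return iso3.isEmpty() ? null : iso3;
        } catch (MissingResourceException e) {
            // Thrown when no 3-letter code is available for the language.
            return null;
        }
    }

    public static void main(String[] args) {
        for (String code : new String[] {"en", "fr", "es"}) {
            System.out.println(code + " -> " + toIso3(code));
        }
    }
}
```

This only ever produces codes that have a 2-letter counterpart, so it can never return "frm" or "fro".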

The client still needs 3-letter codes, and having them change their code is not an option... :(

However, codes like "frm" do not have a 2-letter version:

ISO 639-2 | ISO 639-1 | Language
-- | -- | --
fre (B) / fra (T) | fr | French
frm |   | French, Middle (ca.1400-1600)
fro |   | French, Old (842-ca.1400)

I know that "frm" and "fra" are synonym codes; can they be used interchangeably? If not, is there a way to get "frm" from the ISO 639-1 code "fr"?

Also, Java can use subtags from the new IANA Language Subtag Registry.
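As a rough illustration of what subtag support means here: a `Locale` built from a 3-letter primary language subtag keeps that subtag as its language, and `getISO3Language()` returns a 3-letter language as is. The class name `SubtagDemo` and the helper `primaryLanguage` are hypothetical; this is a sketch, not a way to derive "frm" from "fr".

```java
import java.util.Locale;

class SubtagDemo {
    /** Returns the primary language subtag Java parsed from a BCP 47 tag. */
    static String primaryLanguage(String tag) {
        return Locale.forLanguageTag(tag).getLanguage();
    }

    public static void main(String[] args) {
        Locale middleFrench = Locale.forLanguageTag("frm");
        // A 3-letter language is kept as the locale's language...
        System.out.println(middleFrench.getLanguage());
        // ...and is returned unchanged by getISO3Language().
        System.out.println(middleFrench.getISO3Language());
        // "fr" is a separate language; nothing links it to "frm".
        System.out.println(primaryLanguage("fr"));
    }
}
```

So Java can carry "frm" once it has it, but the 2-letter input "fr" contains no information that would select it.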

But again, how do I get the correct ISO 639-2 code from the 2-letter code?

Tarnished-Coder
  • fra and frm are different languages. The latter is "French, Middle (ca.1400-1600)", and it has no 2-letter value. You could approximate, but it is better not to. – Giacomo Catenazzi Aug 26 '20 at 13:46
  • Thanks, @GiacomoCatenazzi. Do you know a way to generate or obtain the "frm" value from another source or library, e.g. the IANA language subtags? Or would it be better for me to build a dictionary/map from 2-letter codes to 3-letter codes and send those upstream? – Tarnished-Coder Aug 27 '20 at 15:36
  • I do not understand. IANA uses 2- or 3-letter codes; `frm` is allowed as a primary language subtag, and it should be used if we are writing in `frm` [https://tools.ietf.org/html/rfc5646]. You should always use the 2-letter code if one is available. The Wikipedia page "List_of_ISO_639-1_codes" has the mapping from 3-letter to 2-letter codes where one exists. frm is not fr, so it should be kept as 3 letters. Unicode should have good databases (the subproject about the locale registry). I assume Java (and other languages) automatically construct their databases from IANA or Unicode. – Giacomo Catenazzi Aug 28 '20 at 06:00
  • So, I would not transform frm to fr (frm is allowed by IANA, and therefore on the Internet, in HTML, etc.), but I would transform fra/fre to fr: Wikipedia is the simple way (or your Java registry). Unicode surely has the data in some of its databases. Again: if there is no 2-letter code, do not transform it. Otherwise, build the database manually (do not publish it with the original database, but possibly as an annex for special purposes). You may be able to create it automatically from the language family, but I'm not sure. Manual is probably the best way, considering the exceptions in all human creations. – Giacomo Catenazzi Aug 28 '20 at 06:05
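
The rule described in these comments (shorten fra/fre to fr, leave frm alone) could be sketched roughly as follows. The class name `Iso3ToIso2`, the method `shorten`, and the two hand-written bibliographic entries are assumptions for illustration only; a real table would need every bibliographic synonym the data can contain. The reverse table is seeded from `Locale.getISOLanguages()`, which is a standard `java.util.Locale` method.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;

class Iso3ToIso2 {
    private static final Map<String, String> TO_ISO2 = new HashMap<>();

    static {
        // Seed the reverse table from the 2-letter codes Java knows about.
        for (String iso2 : Locale.getISOLanguages()) {
            try {
                String iso3 = Locale.forLanguageTag(iso2).getISO3Language();
                if (!iso3.isEmpty()) {
                    TO_ISO2.put(iso3, iso2);
                }
            } catch (MissingResourceException e) {
                // No 3-letter code available for this entry; skip it.
            }
        }
        // Java reports terminology (T) codes, so bibliographic (B) synonyms
        // are added by hand. Only two illustrative entries are listed here.
        TO_ISO2.put("fre", "fr");
        TO_ISO2.put("ger", "de");
    }

    /** Returns the 2-letter code if one exists, otherwise the input unchanged. */
    static String shorten(String iso3) {
        return TO_ISO2.getOrDefault(iso3, iso3);
    }

    public static void main(String[] args) {
        for (String code : new String[] {"fra", "fre", "frm", "fro"}) {
            System.out.println(code + " -> " + shorten(code));
        }
    }
}
```

Codes such as "frm" and "fro" fall through unchanged because no 2-letter code maps to them.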
