Questions tagged [ietf-bcp-47]

Use this tag for questions about handling identifiers ("tags") for spoken and written languages in a programming context. Specifically, this tag is for identifiers that conform to the IETF's BCP 47 document "Tags for Identifying Languages".

Overview

The IETF's BCP 47 document is a "Best Current Practice" document for identifying written and spoken languages by means of language tags.

The document:

"specifies a particular identifier mechanism (the language tag) and a registration function for values to be used to form tags."

Basic Examples

fr - French
fr-CA - Canadian French
es-419 - Spanish as used in Latin America and the Caribbean
zh-Hant - Chinese written using Traditional Han script

Structure Overview

BCP 47 language tags have a flexible structure that can contain the following subtags, in this order, separated by hyphens:

language-extlang-script-region-variant-extension-private

The language subtag is mandatory and must come first. Its values are taken from ISO 639 language codes.

The extlang (extended language) subtag can be used to provide more specificity - for example, cmn for Mandarin in zh-cmn (Mandarin Chinese).

The script subtag can be used to make a distinction between different written formats of a language (for example, Hant vs. Hans for traditional vs. simplified Chinese).

The region subtag can be a country code (CA in fr-CA) or a UN M.49 region code (419 in es-419).

The variant subtag can provide a finer-grained definition for dialects and scripts. It is rarely needed in common usage.

The extension and private-use subtags can be used for further customized language data.
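
The subtag ordering described above can be sketched in Python. This is a minimal, shape-only parser, assuming a simplified version of the langtag grammar in RFC 5646: it does not consult the IANA subtag registry, and it rejects grandfathered tags, tags that consist only of private-use subtags, and the reserved 4- to 8-letter language subtags. The names LanguageTag and parse_language_tag are made up for this illustration and do not come from any library.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Simplified pattern: language, up to three extlangs, optional script,
# optional region, any number of variants and extensions, optional
# private-use part. It checks the shape of a tag only, not its meaning.
_LANGTAG = re.compile(
    r"""
    (?P<language>[a-z]{2,3})
    (?P<extlang>(?:-[a-z]{3}){0,3})
    (?:-(?P<script>[a-z]{4}))?
    (?:-(?P<region>[a-z]{2}|[0-9]{3}))?
    (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)
    (?P<extensions>(?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*)
    (?:-(?P<private>x(?:-[a-z0-9]{1,8})+))?
    """,
    re.IGNORECASE | re.VERBOSE,
)

# One extension: a singleton (any letter or digit except "x")
# followed by one or more subtags of 2 to 8 characters.
_EXTENSION = re.compile(r"[0-9a-wy-z](?:-[a-z0-9]{2,8})+")


@dataclass
class LanguageTag:
    """Hypothetical container for the parts of a language tag."""
    language: str
    extlangs: List[str] = field(default_factory=list)
    script: Optional[str] = None
    region: Optional[str] = None
    variants: List[str] = field(default_factory=list)
    extensions: List[str] = field(default_factory=list)
    private_use: Optional[str] = None


def parse_language_tag(tag: str) -> LanguageTag:
    """Split a tag into subtags, or raise ValueError if the shape is wrong."""
    match = _LANGTAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a well-formed tag (simplified check): {tag!r}")
    script = match.group("script")
    region = match.group("region")
    private = match.group("private")
    # Tags are case-insensitive; the casing applied here (lowercase language,
    # Titlecase script, uppercase region) is the conventional presentation.
    return LanguageTag(
        language=match.group("language").lower(),
        extlangs=[s.lower() for s in match.group("extlang").split("-") if s],
        script=script.title() if script else None,
        region=region.upper() if region else None,
        variants=[s.lower() for s in match.group("variants").split("-") if s],
        extensions=_EXTENSION.findall(match.group("extensions").lower()),
        private_use=private.lower() if private else None,
    )
```

With this sketch, parse_language_tag("zh-Hant") yields language "zh" with script "Hant" and no region, while parse_language_tag("es-419") yields language "es" with region "419". This also shows why taking the second hyphen-separated piece of a tag does not reliably give the region: that position may hold an extlang or a script instead.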

Resources

The specification document
W3C Guide to Language Tags
Unicode CLDR guide to Picking the Right Language Identifier
Language tag lookup and validation service

16 questions

votes

7 answers

Getting the user's region with navigator.language

For some time, I've been using something like this to get my user's country (ISO-3166): const region = navigator.language.split('-')[1]; // 'US' I've always assumed the string would be similar to en-US -- where the country would hold the 2nd…

asked Aug 29 '16 at 19:35

Jeff

votes

1 answer

How do I get the IETF BCP47 Language code in Android API < 21

Is there a clever way to get the BCP47 language code in Android for APIs less than 21? In API level 21+ the Locale.toLanguageTag is exactly what I need. How would you get this in lower API levels?

android locale ietf-bcp-47

asked Apr 15 '15 at 18:16

superdave

votes

2 answers

How to convert IETF BCP 47 language identifier to ISO-639-2?

I am writing a server API for an iOS application. As a part of the initialization process, the app should send the phone interface language to the server via an API call. The problem is that Apple uses something called IETF BCP 47 language identifier in…

python ios iso-639-2 ietf-bcp-47

asked Sep 28 '14 at 13:53

Adam Matan

votes

0 answers

IANA time zone ID to BCP-47 using ICU4C

Given an IANA time zone ID, such as "America/New_York" or "Europe/Lisbon", how can I obtain the corresponding BCP-47 time zone ID, such as "usnyc" or "ptlis", using ICU4C? These values are required to generate Unicode BCP-47 Locale IDs with…

timezone icu cldr ietf-bcp-47 icu4c

asked Jun 03 '19 at 22:18

kpozin

vote

0 answers

Are all combinations of language codes and regions in the language-subtag-registry valid?

RFC 5646 (https://www.rfc-editor.org/rfc/rfc5646.html) and IANA language subtag registry (https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry) describe and list the language and region codes that make up the tags for…

lang iana ietf-bcp-47 rfc5646

asked Jan 14 '23 at 03:30

Dave Cherkassky

vote

1 answer

Taiwanese language and country codes

I'm a bit uncertain between the two variations below: zh-cht and zh-tw - it's for a site in traditional Chinese, mostly in Taiwan, but with a presence in Macao and Hong Kong. So zh-cht and zh-tw seem to represent the same language. Possibly there are…

country-codes ietf-bcp-47

asked Nov 09 '21 at 20:58

Rogelio

vote

2 answers

Is there a list of BCP 47 language codes in R?

I'm running the fantastic pandoc from within an R package, relying on the LaTeX babel package for some typesetting niceties. Pandoc expects a lang argument as a BCP 47 code (e.g. en-US), but babel expects its own language codes (e.g.…

r latex multilingual pandoc ietf-bcp-47

asked Feb 20 '18 at 14:08

maxheld

vote

1 answer

AAPT ERROR: Invalid BCP 47 tag in directory name b+sr+latn_values

I am trying to run a command via aapt to test out the functionality. ./aapt package -f --no-crunch -M /home/username/AndroidStudioProjects/ProjectName/androidTest/src/main/AndroidManifest.xml -I…

android gradle aapt ietf-bcp-47

asked Nov 25 '17 at 13:32

jgm

vote

1 answer

Do I need hreflang x-default and can I've multiple hreflang point to same URL?

Based on Google's information about hreflang, I came up with this, but I have en and the default pointing to the same URL instead of having another en/. Will that be fine? I don't want to create another folder, as it requires additional maintenance. Basically, the…

seo multilingual ietf-bcp-47

asked Jul 27 '14 at 18:51

sparkmix

votes

1 answer

Get exact language object from display name

Using langcodes package, how do I obtain the exact language object from the display name? For example, langcodes.find("English (United Kingdom)") returns Language.make(language='en') instead of returning Language.make(language='en', territory='GB')…

python localization ietf-bcp-47 iso-639

asked Jul 09 '23 at 16:10

Anm

votes

0 answers

How can I get the name of a language in any other language, based on IETF language tag

I'm looking for a way to get the name of a language in any other language, based on an IETF language tag. For example: I have a list of IETF language tags ('en', 'fr', 'nl', 'de', ...) and I want them mapped to the language display name of my…

json ietf-bcp-47

asked Mar 26 '23 at 21:33

Renaat De Muynck

votes

0 answers

How to normalize semantically same language tags? [cldr]

I am currently browsing the cldr-common-42 database and I find the use of language tags a bit confusing. For example, the tag ar-EG is used for translations in Egyptian Arabic. However, when looking for what "Egyptian Arabic" is in other languages,…

iana cldr ietf-bcp-47

asked Feb 03 '23 at 20:21

Vasco Lange

votes

0 answers

BCP 47 language tag for Gaelic with overdot

In traditional orthography, Gaelic uses the overdot with certain consonants, instead of appending an "h". For example, "ḃ" is equivalent to "bh". What is the BCP 47 language tag for Gaelic with overdot?

ietf-bcp-47

asked Sep 19 '22 at 10:08

jochen

votes

1 answer

converting TrueType Macintosh Language Codes to BCP 47 language tags

TrueType fonts use "Macintosh Language Codes" to describe the language of localised strings in the "name" table. A list of language codes can be found in the TrueType spec. I need to convert these language codes to BCP 47 language tags. Is…

locale truetype ietf-bcp-47

asked Aug 24 '22 at 11:00

jochen

votes

0 answers

How to validate xml:lang ATTLIST inside XML with DTD?

Many articles on the internet (like this one) suggest using xml:lang or some custom attribute to encode meta-information about language inside XML tags. They mention that these codes have to comply with BCP47 standard. Let's see what would happen if…

xml-validation dtd dtd-parsing ietf-bcp-47 xml

asked Jun 07 '19 at 13:24

soshial
