Intl Extension Uses The RFC 4646 Language Tag

The documentation for the PHP intl extension, specifically for the Locale class, says:

Locales are identified using RFC 4646 language tags

RFC 4646 is deprecated

I mention this because RFC 4646 has been obsoleted by RFC 5646.

I had intended to use the intl extension to translate ISO 639 codes to their respective language name (e.g. en to English). Whilst this is not essential, it would be a helpful ability to have available.
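A minimal sketch of the lookup I have in mind, assuming the intl extension is loaded. The helper name `languageName` is hypothetical, and the check for a code being echoed back reflects how ICU typically behaves when it has no display name for a code, so treat it as an assumption rather than a guarantee.

```php
<?php
// Sketch only: translate an ISO 639 code into a display name via intl.
// languageName() is a hypothetical helper, not part of the intl API.

function languageName(string $code, string $displayLocale = 'en'): ?string
{
    // Without intl there is nothing to look up.
    if (!class_exists('Locale')) {
        return null;
    }

    // Locale::getDisplayLanguage() is the intl call that does the work.
    $name = Locale::getDisplayLanguage($code, $displayLocale);

    // Assumption: when ICU has no name for a code, it tends to hand the
    // code itself back, so an echoed code is treated as "unknown".
    if ($name === false || $name === '' || strcasecmp($name, $code) === 0) {
        return null;
    }

    return $name;
}
```

Calling `languageName('en')` is the `en` to English case described above. A `null` result means either that intl is missing or that the installed ICU data has no name for that code.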

ISO 639 changes frequently

The ISO 639 language codes are actively maintained and altered by their respective registration authorities. This is not a hypothetical concern either; see the change log for ISO 639-3.

The use of the now deprecated language tag leaves me a little concerned about how actively the intl extension is being maintained. RFC 5646 was released in 2009, so it is hardly a cutting-edge standard.

Question

  • Are the ISO-639 language codes used by intl updated regularly?
  • Can intl be relied upon as an authoritative lookup resource for ISO 639?
  • If not, is there an authoritative lookup resource for ISO 639 in PHP?
Community
  • The _intl_ extension relies on ICU to supply language codes, among other things. The ICU version is stored in the constant `PHP_ICU_VERSION`. However, I'm not familiar with how ICU stores language codes, except that it likely relies on the Unicode Common Locale Data Repository (CLDR). – Peter O. Mar 18 '16 at 00:20
  • @PeterO. *ICU* is actively maintained with public change-logs and the `PHP_ICU_VERSION` constant will give me a suitable means of tracking to make sure that *intl* is serving up-to-date information. So, with a little bulking out, I'd accept your comment as the answer. –  Mar 18 '16 at 00:51
  • The age of the standard has nothing to do with the specific codes used, much less their translation into various languages (which is your specific question). Referring to RFC 4646 or 5646 is not the best practice - a better citation would be "IETF BCP 47". You will note that both RFCs are designated "BCP 47". – Steven R. Loomis Mar 18 '16 at 15:37
  • @StevenR.Loomis I referred to the RFCs because the *intl* documentation refers to RFC 4646 (not BCP 47). Regardless, the [IETF BCP 47](https://tools.ietf.org/html/bcp47) standard explicitly [describes the ways in which RFC 4646 is now obsolete](https://tools.ietf.org/html/bcp47#section-8); the introduction of 639-3 and 639-5 codes is pretty significant. Put simply, the tags described in RFC 4646 are no longer exhaustive. So... what's the problem with my question? –  Mar 18 '16 at 15:47
  • @PeterTòmasScott I meant that the PHP Intl docs you mentioned ought to refer to BCP 47, not to the RFCs. Of more concern than the age of the RFC are these questions: Are updates to ISO 639 incorporated? When was the localization data updated? How was the localization data collected and vetted? As was noted, ICU/CLDR pull in updated ISO codes and update localized data regularly. So let's start from the ICU version to answer your question. That said, CLDR does not comprehensively translate all valid ISO-639 language identifiers. Summary: Your Q was fine, docs should be clearer. – Steven R. Loomis Mar 18 '16 at 15:52
  • @StevenR.Loomis ah, I understand now. Yes, with *intl* using ICU/CLDR I'm satisfied that I'm not going to be missing anything providing that the ICU version is kept recent. Having a means of tracing changes in the ISO 639 codes through to the application *(via the ICU version number and, consequently, the ICU change logs)* is very helpful; at least I will have a path for tracing and resolving any problems I encounter. –  Mar 18 '16 at 16:03
  • @PeterTòmasScott That seems like a new, ICU focussed question… can you ask a new question about ISO-639 in ICU and link it here and I'll try to answer? – Steven R. Loomis Mar 18 '16 at 16:05
  • @StevenR.Loomis http://stackoverflow.com/questions/36089760/icu-cldr-iso-639-changes-monitoring-maintenance –  Mar 18 '16 at 16:39

1 Answer

The intl extension relies on ICU to supply language codes, among other things. The ICU version is stored in the constant PHP_ICU_VERSION. ICU itself relies on the Unicode Common Locale Data Repository (CLDR), as the Locale class documentation you cited states: "The extensions used by CLDR in [Unicode Technical Standard] #35 (and inherited by ICU) are valid and used wherever they would be in ICU normally."
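A minimal sketch of reading the ICU version at runtime, so that a deployment can be checked against the ICU change logs. The helper names `icuVersionInfo` and `icuIsAtLeast` are hypothetical. The sketch looks for `INTL_ICU_VERSION` and `INTL_ICU_DATA_VERSION`, the names listed among the intl extension's predefined constants, and falls back to the `PHP_ICU_VERSION` name mentioned above if that one happens to be defined. Which of these exists on a given build is an assumption worth verifying.

```php
<?php
// Sketch only: report which ICU build backs the intl extension.
// icuVersionInfo() and icuIsAtLeast() are hypothetical helpers.

function icuVersionInfo(): array
{
    $info = ['icu' => null, 'data' => null];

    if (!extension_loaded('intl')) {
        return $info; // no intl, so no ICU version to report
    }

    // Use whichever version constant this build defines (assumption:
    // at least one of these names is present when intl is loaded).
    foreach (['INTL_ICU_VERSION', 'PHP_ICU_VERSION'] as $name) {
        if (defined($name)) {
            $info['icu'] = (string) constant($name);
            break;
        }
    }

    // Version of the ICU data files, when the build exposes it.
    if (defined('INTL_ICU_DATA_VERSION')) {
        $info['data'] = (string) constant('INTL_ICU_DATA_VERSION');
    }

    return $info;
}

function icuIsAtLeast(string $minimum): bool
{
    $icu = icuVersionInfo()['icu'];

    // version_compare() handles dotted versions such as "57.1".
    return $icu !== null && version_compare($icu, $minimum, '>=');
}
```

An application could call `icuIsAtLeast()` at start-up with whatever minimum version it has decided it needs and log a warning when the check fails. The threshold is a project decision and does not come from intl.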

Peter O.