Context

Getting the locale with python on Windows seems to be broken:

(trash0) PS C:\Users\myname\venv\trash0\Lib\site-packages> python.exe
Python 3.6.3 (v3.6.3:2c5fed8, Oct  3 2017, 17:26:49) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.platform
'win32'
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'fr-FR')
'fr-FR'
>>> locale.getlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Python36-32\lib\locale.py", line 581, in getlocale
    return _parse_localename(localename)
  File "C:\Program Files\Python36-32\lib\locale.py", line 490, in _parse_localename
    raise ValueError('unknown locale: %s' % localename)
ValueError: unknown locale: fr-FR
>>>

I do not know much about Windows, but I have checked that fr-FR is a valid Windows locale name. Note that using en-US or en-GB gives the same result.

Yet setting the locale works correctly because:

  • using locale.setlocale() with any unknown value would raise an exception:
  >>> locale.setlocale(locale.LC_ALL, 'anythingundefined')
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "C:\Program Files\Python36-32\lib\locale.py", line 598, in setlocale
      return _setlocale(category, locale)
  locale.Error: unsupported locale setting
  >>>
  • once set, it's possible to check it is taken into account:
  >>> locale.setlocale(locale.LC_ALL, 'fr-FR')
  'fr-FR'
  >>> locale.str(12.3)
  '12,3'
  >>> locale.setlocale(locale.LC_ALL, 'en-GB')
  'en-GB'
  >>> locale.str(12.3)
  '12.3'
  >>>

Question

I need to temporarily set the locale to en-US (in order to perform some things that require this locale) and then switch back to the default locale. How is it possible to do it if locale.getlocale() is broken? I've read the python doc about locale but can't figure out any workaround to achieve this (nor whether it is possible at all).

martineau
zezollo
  • You could monkey-patch the `locale` module and replace `setlocale()` with your own function that remembered the last locale successfully set, and likewise replace `getlocale()` with a function that uses that saved value if there is one. – martineau Dec 20 '17 at 16:51
  • `locale.getlocale` hasn't been updated to support parsing Windows locale names that are delimited by a hyphen (e.g. "fr-FR"). These names are based on RFC 4646 language tags. They were introduced in Vista, along with many new NLS functions that use locale names instead of LCIDs. – Eryk Sun Dec 20 '17 at 23:46
  • You can use the legacy C runtime locale strings that are delimited by an underscore. These use 3-letter abbreviations such as "FRA_FRA[.codepage]" or "ENU_USA[.codepage]", or long names such as "French_France[.codepage]" or "English_United States[.codepage]". The optional codepage is the numeric identifier of a legacy codepage, such as "ENU_USA.1252". – Eryk Sun Dec 20 '17 at 23:48
  • @martineau Thanks for your suggestion! I've been thinking about it but I think I'm still stuck in the same place: I cannot figure out how a monkey-patched version of `getlocale()` could find out to which value the user *may* have set the locale previously (before calling my library's functions at all). – zezollo Dec 21 '17 at 11:16
  • @eryksun Thanks for this valuable information! I have been thinking that maybe it would be up to the user to choose a working value, after all. If the user sets the locale to `fra_fra`, `getlocale()` does not raise an exception. The only problem then is that the value it returns cannot be used by `setlocale()`, so it's not possible to set the locale back to a usable value, *unless* a kind of conversion table could map e.g. `fra_fra` to the value returned by `getlocale()`. I couldn't find the list of legacy codes; is there any web page that still lists them? – zezollo Dec 21 '17 at 11:22
  • zezollo: Once the `locale` module is monkey-patched, all uses of it from then on in the same run of the interpreter will use the modified library when it's imported due to the way the system caches already loaded libraries. This means that if you can somehow arrange for the modifications to occur near the beginning of execution, then it **will** be possible to know the last value passed to `setlocale()`, for example, by other pieces of code because the mods were in place before it called. – martineau Dec 21 '17 at 13:42
  • @martineau Do you mean somehow that importing my library will replace any further imports of `locale` by the monkey patched version? I don't exactly know how to do this however, but I guess this is an other question. I'm going to investigate this. – zezollo Dec 21 '17 at 14:17
  • zezollo: Yes, that's what I meant. This is described in a little more detail in [this answer](https://stackoverflow.com/a/18561055/355230) of mine (to an unrelated question). If you want, I could post an answer showing how to do it to the `locale` module. – martineau Dec 21 '17 at 14:21
  • @martineau yes, why not, as this seems to be a possible workaround. I'll check it for sure, even if I'll need some time to read this all throughout (including the link of your previous comment) and do some tests. – zezollo Dec 21 '17 at 14:43
  • zezollo: Breaking news: While writing a monkey-patch for you and looking at the source code for the `locale` module, I noticed that the reason you're getting the `ValueError: unknown locale: fr-FR` exception is because the `locale` module really doesn't think the `'fr-FR'` is valid. Try using `locale.setlocale(locale.LC_ALL, 'fr')` instead. Note I'm running Windows 7. – martineau Dec 21 '17 at 15:36
  • zezollo: The reason it matters whether the `locale` module likes the locale string in the monkey-patch is that I wanted it to remember only values that were used successfully (i.e. valid), not any that were tried but didn't work, so it can properly keep track of the current state of the module. – martineau Dec 21 '17 at 15:52
  • @martineau Excellent! This may already be a good work around: let the user set the locale "correctly" (i.e. using `fr`, or `fra_fra`) and then, when getting the locale (as `fr_FR`) "parse" it to use only the first part (`fr`) to set it back. In order to ensure the codepage is taken into account too, a kind of conversion table between the value returned by `getlocale()` and the one to really use (`fr` or `fra_fra` like suggested by @eryksun) would be useful (hence find a list of legacy 3 letters codes). A monkey patch looks even better though (would not depend on such a conversion table). – zezollo Dec 21 '17 at 16:19
  • zezollo: Sorry I don't follow your logic. Why would it be correct for the user to get `fr_FR` back when it must have been set using something else (because the user set the locale "correctly")? – martineau Dec 21 '17 at 16:30
  • @martineau Because when setting the locale to `fr-FR`, `getlocale()` raises an exception; setting it to `fr` lets `getlocale()` return `('fr_FR', 'ISO8859-1')`; and setting it to `fra_fra` lets `getlocale()` return `('fr_FR', 'cp1252')`, so we never get back the value actually used. Plus, in all cases, `fr_FR` cannot be used to set the locale. So, a monkey patch that would remember the last value used to successfully set the locale looks definitely better, actually (I realize this would work even if the user did set the locale to `fr-FR`; my "workaround" would not work in that case). – zezollo Dec 21 '17 at 16:45
  • zezollo: The issue is that `setlocale()` doesn't raise an exception when it's called with `'fr-FR'`, so at that point there's no indication whether it thinks it was valid or not (despite what the documentation says). An exception isn't raised until a call to `getlocale()` is made later. I suppose the monkey-patch code could call `getlocale()` itself every time a call is made to `setlocale()` in order to determine if it worked, but that seems overly hacky to me. – martineau Dec 21 '17 at 17:04
  • @martineau: if I'm not wrong and have understood everything correctly, a monkey patch that would provide the last value used in `setlocale()` (any value that did not raise an exception, be it `'fr-FR'`, `'fr'` or `'fra_fra'`) would be a correct solution. If this value is made available (via a `locale.setting_value()` call or something similar), then it's always possible to use it later in `setlocale()`. – zezollo Dec 21 '17 at 17:25
  • @zezollo, `locale.getlocale` tries to be helpful when parsing the locale. In this case "French_France" gets mapped to "fr_FR" via the `locale.locale_alias` mapping, but Windows does not use POSIX locale strings, so this is all broken. Use `locale._setlocale(category, None)` to query the exact locale string for the given category. – Eryk Sun Dec 21 '17 at 20:52
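
Eryk Sun's last comment suggests querying the raw locale string instead of going through `getlocale()`. A minimal sketch of that idea follows, assuming the public `locale.setlocale(category)` call with no locale argument behaves like the `locale._setlocale(category, None)` query he mentions. The helper name `temporary_locale` is hypothetical, and `"C"` is used only because it should exist on every platform; on Windows a name such as `'en-US'` would take its place.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(name, category=locale.LC_ALL):
    """Switch `category` to `name`, then restore the previous setting."""
    # Calling setlocale() without a locale only queries the current setting
    # and returns the raw string used by the C runtime (no parsing involved).
    saved = locale.setlocale(category)
    try:
        yield locale.setlocale(category, name)
    finally:
        # The raw string came from setlocale(), so it can be passed back as-is.
        locale.setlocale(category, saved)


# Usage: numbers are formatted with the "C" rules only inside the block.
with temporary_locale("C", locale.LC_NUMERIC):
    print(locale.str(12.3))
```

Because the saved value is never parsed, this approach should not depend on whether `getlocale()` understands the name that was originally set.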

1 Answer

Here's how to monkey patch the locale module as I was trying to describe in my comments under your question.

First the monkey-patching module, locale_patch.py:

""" Module that monkey-patches the locale module so it remembers the last
arguments to setlocale() that didn't raise an exception and will allow them to
be retrieved later by calling a new function named setting_values() which also
gets added.
"""
import locale as _locale

_last_category, _last_locale = None, None

def my_setlocale(category, locale=None):
    global _last_category, _last_locale

    try:
        result = _orig_setlocale(category, locale)
    except _locale.Error:
        raise  # Didn't work, ignore arguments.

    if locale is not None:  # Was a setting modified by call?
        _last_category, _last_locale = category, locale  # Remember args.

    return result

def setting_values():
    global _last_category, _last_locale

    if _last_category is None:
        raise _locale.Error('setlocale() has never been called to change settings')

    return _last_category, _last_locale


# Monkey-patch the module.
_orig_setlocale = _locale.setlocale
_locale.setlocale = my_setlocale
_locale.setting_values = setting_values  # New module function.

Sample usage:

import locale
import locale_patch  # Apply monkey-patch(es).

try:
    locale.setlocale(locale.LC_ALL, 'fr_FR.UTF-8') # locale.Error: unsupported locale setting
except locale.Error:
    print("locale.setlocale(locale.LC_ALL, 'fr_FR.UTF-8') didn't work")
try:
    print(locale.setting_values())
except locale.Error:
    print("locale.setting_values() didn't work")  # Expected.

try:
    locale.setlocale(locale.LC_ALL, locale='fr_FR.UTF-8')
except locale.Error:
    print("locale.setlocale(locale.LC_ALL, locale='fr_FR.UTF-8') didn't work")
try:
    print(locale.setting_values())
except locale.Error:
    print("locale.setting_values() didn't work")  # Expected.

locale.setlocale(locale.LC_ALL, 'fr-FR')
results = locale.setting_values()
print(results)  # -> (0, 'fr-FR')  # The 0 is the numeric value of locale.LC_ALL

locale.setlocale(*results)  # Works OK.
martineau
  • Many thanks, this is very clear and actually not as complicated as I thought it would be. If I understand everything correctly, I do `import locale_patch` in the `__init__.py` of my library and this will cause importing my library to automatically patch the locale. Then the only situation where `RuntimeError` may be inappropriately raised is if the user sets the locale *before* importing my library, which does not look like good code anyway (import statements should be at the start of the code). – zezollo Dec 22 '17 at 09:29
  • There's a small error to fix in the code of your answer: using `locale` as keyword argument of `my_setlocale()` locally overrides `locale` and in case of an error raised by the original `setlocale()` then instead of raising it further, the expression `locale.Error` raises itself an exception (to complain that `str` does not have an `Error` attribute). I renamed this keyword to `value` and everything works fine. – zezollo Dec 22 '17 at 09:32
  • @zezollo: Good catch. It was that way because I just copied what's shown in the [current documentation](https://docs.python.org/3/library/locale.html#locale.setlocale)—which of course isn't "real" code. FWIW, in my update I renamed it `locale_name` only because it's a little more descriptive than simply `value`. `;¬)` – martineau Dec 22 '17 at 15:28
  • Well, my correction is not as good as it looked, because `locale` is *indeed* used as a keyword in the original `setlocale()` (as in the doc...), so the patch fails on calls that were correct with the original `setlocale`, like `setlocale(locale.LC_ALL, locale='fr_FR.UTF-8')` (which raises `TypeError: my_setlocale() got an unexpected keyword argument 'locale'`). – zezollo Dec 23 '17 at 07:40
  • So I thought I could replace this offending signature by `my_setlocale(category, **kwargs)` and inside the function, add `locale_name = kwargs.get('locale', None)` but then a call like `setlocale(locale.LC_ALL, 'fr_FR.UTF-8')`, that *is considered correct* by the original function, will make the patch fail (this raises `TypeError: my_setlocale() takes 1 positional argument but 2 were given`). The choice of the developers to use a module's name as keyword argument in the very same module makes it difficult to monkey patch it without breaking anything! – zezollo Dec 23 '17 at 07:51
  • zezollo: Assuming I've understood everything you're now saying, see if the latest update to my answer resolves them. Note that although `setlocale(locale.LC_ALL, 'fr_FR.UTF-8')` will now be considered a correct _call_ to the function (and now also by the one in the patch), the value isn't considered valid and it raises a `locale.Error: unsupported locale setting`—so the patch won't remember the arguments that were used which means calling `setting_values()` afterwards will raise its own exception (by design). The sample usage section now shows this. – martineau Dec 23 '17 at 16:43
  • Sorry for the delay. Your solution (`import locale as _locale` and related changes) is simple, looks best and works fine on both Linux and Windows (sorry for this `fr_FR.UTF-8`, that can be used on Linux only). Many thanks! And a happy new year! – zezollo Jan 01 '18 at 18:02
  • zezollo: Better late than never. Thanks and happy New Year! – martineau Jan 01 '18 at 19:23