
This bit of code solves a problem I had. However, sys.setdefaultencoding() is not available unless I reload the sys module first.

What is this quirk of the language called? Why wasn't I told about it earlier? Where can I read more about it?

import sys
reload(sys)
sys.setdefaultencoding("utf8")

Source:

http://mypy.pythonblogs.com/12_mypy/archive/1253_workaround_for_python_bug_ascii_codec_cant_encode_character_uxa0_in_position_111_ordinal_not_in_range128.html

vish

1 Answer


The 'quirk' is the site module deliberately deleting the sys.setdefaultencoding() function:

# Remove sys.setdefaultencoding() so that users cannot change the
# encoding after initialization.  The test for presence is needed when
# this module is run as a script, because this code is executed twice.
if hasattr(sys, "setdefaultencoding"):
    del sys.setdefaultencoding

You should not use it! Setting the default encoding to UTF-8 is like strapping a stick to your leg after you broke it and walking on instead of having a doctor set the broken bones.

Really, let me make it clear: there is a reason it is removed. By setting it you a) break any module that relies on the normal default, and b) mask your actual problem, which is that you are not handling Unicode correctly. The real fix is to decode as early as possible and to postpone encoding until you need to send the data out again.
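To illustrate "decode early, encode late", here is a minimal sketch. The sample bytes and the variable names are made up for illustration, and UTF-8 is only assumed as the input encoding; your data may use a different one. It is written for Python 3, where the decoded value is a str.

```python
# Hypothetical input: UTF-8 encoded bytes as they might arrive from a file
# or socket, containing an e-acute and a non-breaking space (U+00A0).
raw = b"caf\xc3\xa9\xc2\xa0au lait"

# Decode as early as possible, naming the encoding explicitly.
text = raw.decode("utf-8")

# Work on text only; here the non-breaking space becomes a plain space.
cleaned = text.replace(u"\xa0", u" ")

# Encode only at the boundary where bytes have to leave the program.
out = cleaned.encode("utf-8")
```

With the encoding named at both boundaries, nothing depends on the interpreter's default encoding, so there is nothing left for sys.setdefaultencoding() to paper over.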

That out of the way: reload() lets you bypass the module cache. import loads a Python module only once; subsequent imports give you the already-loaded module. reload() loads the module anew, as if it had never been imported, and merges the new names back into the existing module object (to preserve extra names added later):

Reload a previously imported module. The argument must be a module object, so it must have been successfully imported before. This is useful if you have edited the module source file using an external editor and want to try out the new version without leaving the Python interpreter. The return value is the module object (the same as the module argument).

When reload(module) is executed:

  • Python modules’ code is recompiled and the module-level code reexecuted, defining a new set of objects which are bound to names in the module’s dictionary. The init function of extension modules is not called a second time.
  • As with all other objects in Python the old objects are only reclaimed after their reference counts drop to zero.
  • The names in the module namespace are updated to point to any new or changed objects.
  • Other references to the old objects (such as names external to the module) are not rebound to refer to the new objects and must be updated in each namespace where they occur if that is desired.
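The points in the list above can be observed directly. This is a sketch under a few assumptions: it uses Python 3, where reload() lives in importlib rather than being a built-in, and it uses the pure-Python stdlib module colorsys purely as a stand-in; the attribute extra_name is invented for the demonstration.

```python
import importlib
import colorsys  # stand-in module, chosen only because it is small and pure Python

# Keep an external reference to an object defined by the module.
old_func = colorsys.rgb_to_hsv

# Add an extra name that the module's own source does not define.
colorsys.extra_name = "added later"

reloaded = importlib.reload(colorsys)

# reload() returns the same module object it was given.
same_module = reloaded is colorsys

# The module-level code ran again, so the name now points to a new function object.
rebound = colorsys.rgb_to_hsv is not old_func

# Names that the re-executed code does not define are left in place.
extra_kept = colorsys.extra_name == "added later"

# The external reference was not rebound; it still refers to the old function,
# which keeps working because its reference count has not dropped to zero.
old_result = old_func(1.0, 0.0, 0.0)
new_result = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
```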

So reload() restores the deleted sys.setdefaultencoding() name into the module.
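The same effect can be reproduced on Python 3 with a stand-in module, since sys.setdefaultencoding() no longer exists there. In this sketch, colorsys and its rgb_to_hsv function only play the role of sys and setdefaultencoding; the deletion mimics what the site module does at startup, and importlib.reload() is the Python 3 spelling of reload().

```python
import importlib
import colorsys  # stand-in for sys

# Mimic site.py: remove a name from the already-imported module.
del colorsys.rgb_to_hsv

# A plain import returns the cached module object, so the name stays gone.
import colorsys
after_import = hasattr(colorsys, "rgb_to_hsv")

# reload() re-executes the module's code and puts the name back.
importlib.reload(colorsys)
after_reload = hasattr(colorsys, "rgb_to_hsv")
```

Here after_import is False and after_reload is True.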

Martijn Pieters
  • Wait, so what is different when I "import sys" and when I reload it? Is site loaded inside sys, but not reloaded when I reload it? (I agree about the Unicode, I just didn't understand the magic of reload.) – vish May 28 '14 at 18:04
  • @vish: Python caches modules. `import sys` gives you the already loaded module. `reload()` really re-imports, and merges the loaded module back into the cache. – Martijn Pieters May 28 '14 at 18:05