How do I get compatible type() behaviour in python 2 & 3 with unicode_literals?

Question

This question looks strikingly similar to this one, however the suggestion in the comments there doesn't work (anymore?) as demonstrated below.

I'm trying to write a package that is compatible with both Python 2 and 3. One of my methods generates classes, and type() is giving me problems in the Python 2.7 tests:

Python 2.7.13 (default, Mar 18 2017, 17:03:32) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import unicode_literals
>>> from builtins import str
>>> type('MyClass', (object,), {})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type() argument 1 must be string, not unicode
>>> type(str('MyClass'), (object,), {})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type() argument 1 must be string, not newstr

The Python-Future overview page says:

# Compatible output from isinstance() across Py2/3:
assert isinstance(2**64, int)        # long integers
assert isinstance(u'blah', str)
assert isinstance('blah', str)       # only if unicode_literals is in effect

I expected this would give me consistent behaviour anywhere that strings are required, but apparently not.

What's the correct, version-independent way to do this? The other question I linked to was asked in the Python 2.6 era, and the behaviour seems to have changed since then. I don't think I can just drop unicode_literals, since without it I run into portability problems (elsewhere) with calls to hashlib.

Why do you import unicode_literals? It appears to cause more problems than it solves. For this particular example you don't need it; the code works in both py2 and py3 without it. – Copperfield Mar 22 '17 at 01:39
Unless you are doing string manipulation that requires a distinction between byte strings and unicode strings, or you are encoding and decoding, you most likely don't need unicode_literals. Leave the literals as they are and the code will work fine in py2 and py3. – Copperfield Mar 22 '17 at 02:00
I can't guarantee that strings passed to this library will never be Unicode. And there are some methods that explicitly require Unicode strings. It seems that the recommended way to deal with that in 2-3 code is to use unicode_literals. – mpounsett Mar 22 '17 at 03:56

score 5 · Accepted Answer · edited Jan 30 '20 at 18:09

Don't use builtins.str(); use the plain str that comes with your Python version:

>>> from __future__ import unicode_literals
>>> type(str('MyClass'), (object,), {})
<class '__main__.MyClass'>

This works in both Python 2 and 3. If the future.builtins module replaces the str built-in type by default, import the original type from the __builtin__ module on Python 2:

try:
    # Python 2
    from __builtin__ import str as builtin_str
except ImportError:
    # Python 3
    from builtins import str as builtin_str

MyClass = type(builtin_str('MyClass'), (object,), {})
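If class creation happens in several places, the same fallback import can be wrapped in a small helper. The sketch below is only an illustration: `make_class` and its parameters are hypothetical names, not part of `future` or the standard library, and the `u''` prefix stands in for a literal written under `unicode_literals`.

```python
try:
    # Python 2: the native byte-string str that type() insists on
    from __builtin__ import str as builtin_str
except ImportError:
    # Python 3: str is already the text type
    from builtins import str as builtin_str


def make_class(name, bases=(object,), namespace=None):
    """Create a class, coercing the name to the interpreter's native str.

    Hypothetical helper for illustration only.
    """
    return type(builtin_str(name), bases, dict(namespace or {}))


# u'' mimics a literal under `from __future__ import unicode_literals`
Point = make_class(u'Point', namespace={'dimensions': 2})
print(Point.__name__)
```

On Python 2 the coercion goes through the default ASCII codec, so a class name containing non-ASCII characters would raise UnicodeEncodeError there; plain ASCII names are the safe case.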

edited Jan 30 '20 at 18:09

rofls


answered Apr 10 '17 at 20:36

Martijn Pieters


It looks like I actually need to use a mixture of this and builtins.str() ... in most places in my code I need the Python3 compatible behaviour of builtins.str() (for example expecting issubclass(x, str) to return True for unicode literals), and then for some very few specific cases (like type()) I need __builtin__.str(). Does that seem right? – mpounsett Apr 11 '17 at 19:09
@mpounsett: sure, and all you need to do is make sure you keep the two separate by assigning them to different names. Because the `future.builtins` module is meant to be a drop-in shim for the Python 3 `builtins` module, just keep using `from builtins import str`, but use the `try..except ImportError` with `import .. as ...` trick to keep access to the Python 2 `str` object under a different name. – Martijn Pieters Apr 11 '17 at 19:19
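A rough sketch of how the two types might sit side by side in one module, assuming the `future` package supplies `builtins` on Python 2. The names `native_str`, `is_text` and `build_class` are made up for this example:

```python
# Python 3 semantics for everyday checks. On Python 2 this import
# only works if the `future` package is installed (assumption).
from builtins import str

try:
    # Python 2: keep the interpreter's own str under a separate name
    from __builtin__ import str as native_str
except ImportError:
    # Python 3: there is only one str
    native_str = str


def is_text(value):
    """Check against the Python 3 style text type."""
    return isinstance(value, str)


def build_class(name):
    """Validate with the shim type, then hand type() the native one."""
    if not is_text(name):
        raise TypeError('class name must be text, got %s' % type(name).__name__)
    return type(native_str(name), (object,), {})


Widget = build_class(u'Widget')
print(Widget.__name__)
```

The point of the separate name is that `str` keeps meaning "text" everywhere else in the module, while the few calls that need the interpreter's own type use `native_str` explicitly.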
