I am porting my code to Python 3 while maintaining backwards compatibility with Python 2.

The `str` function converts strings with non-ASCII characters differently in Python 2 and Python 3. For example:

Python 2:

In [4]: str('Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve')
Out[4]: 'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. L\xc3\xb6ve & D. L\xc3\xb6ve'

But in Python 3:

In [1]: str('Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve')
Out[1]: 'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve'

How can I get the same representation in Python 2 that Python 3 gives? I am writing the strings to a sqlite3 table.

goelakash
  • If you want unicode, use unicode. `print u'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve'` does what you want in Python 2. – Two-Bit Alchemist Jun 07 '16 at 18:44

1 Answer

It appears that what you want is a unicode string literal. In Python 3, all normal string literals are unicode string literals. In Python 2, a normal string literal is a byte string (`str`), and only `unicode` values are unicode strings. To create a unicode string literal in Python 2, put a `u` in front of the literal:

u'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve'

This is the same representation as your Python 3 string. Note that if your source file is in UTF-8 encoding, you need to add a special comment to indicate this, on the first or second line, such as:

# -*- coding: utf-8 -*-

For more information on this, see PEP 263 or this other question.
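
As a minimal sketch of how the two pieces above fit together, the snippet below declares the source encoding, builds the string with a `u` prefix (which Python 3.3 and later also accept), and round-trips it through an in-memory sqlite3 database. The table name `plants` and its single `name` column are hypothetical, chosen only for illustration, and the snippet assumes the file is saved as UTF-8.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

import sqlite3

# The u prefix makes this a unicode string in Python 2; Python 3.3+ accepts
# the prefix too, where it is simply a normal str.
name = u'Alnus viridis (Chaix) DC. ssp. sinuata (Regel) A. Löve & D. Löve'

# Hypothetical table used only for this illustration.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE plants (name TEXT)')
conn.execute('INSERT INTO plants (name) VALUES (?)', (name,))

# With the default text handling, sqlite3 hands TEXT values back as unicode
# (Python 2) or str (Python 3), so the value compares equal to the original.
stored = conn.execute('SELECT name FROM plants').fetchone()[0]
conn.close()

print(stored == name)
```

Passing the value as a query parameter, rather than formatting it into the SQL text, leaves the encoding work to the sqlite3 module on both versions.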

Dan Getz
  • This too gives an error in Python 2. I need to keep the APIs consistent for python 2/3 compatibility. I would prefer to keep the human-readable format (i.e., get python3 type conversion). – goelakash Jun 07 '16 at 18:12
  • Sorry for the confusion. I think this now answers your question as written? – Dan Getz Jun 07 '16 at 18:17
  • @goelakash Oh wait, are you trying to make a source file that runs correctly on both Python 2 and Python 3? If so, please edit your question to say this. – Dan Getz Jun 07 '16 at 18:22
  • Yes, sorry for any doubts. So would this header maintain compatibility for `utf-8` strings b/w python 2 and 3? It seems to work for a sample case: `print(s,str(s))` where `s = 'A. Löve & D. Löve'`. – goelakash Jun 07 '16 at 18:33
  • No, I was not attempting to write Python 3 code that runs in Python 2. I was showing how to write code in Python 2 that does the same as your Python 3 code. I think you haven't put enough information in your question to answer it in terms of having your code work on both versions. You speak of "encoding", but unicode strings aren't encoded in UTF-8 in Python. They *can be* encoded *into* UTF-8, but that's different. Could you edit your question to add your real requirements for your code? How are you using the value after it's created? – Dan Getz Jun 07 '16 at 18:36
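
The distinction drawn in the last comment, between a unicode string and its UTF-8 encoding, can be sketched as follows. This is an illustration only, assuming a UTF-8 source file; the variable names are made up for the example. The `\xc3\xb6` bytes in the question's Python 2 output are the UTF-8 encoding of `ö`.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# A unicode string holds code points, not bytes.
text = u'A. Löve'

# Encoding *into* UTF-8 produces a byte string; 'ö' becomes two bytes.
encoded = text.encode('utf-8')

# Decoding the bytes gives back the original unicode string.
decoded = encoded.decode('utf-8')

print(len(text))     # 7 code points
print(len(encoded))  # 8 bytes
print(decoded == text)
```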