The specific difference I encountered was with Unicode strings. First, I created a Unicode string containing some non-ASCII characters. I then stored its UTF-8 encoding in another variable. When I print either the encoded string or the Unicode string, I see the correct text with the correct special characters. However, when I simply enter the variable names in the REPL, I see something different. I am not entirely sure what the REPL is outputting, as Unicode is new to me. This is the input and output of the REPL:

>>> unicode_string = u"Fu\u00dfb\u00e4lle"
>>> encoded = unicode_string.encode('utf-8')
>>> print encoded
Fußbälle
>>> encoded
'Fu\xc3\x9fb\xc3\xa4lle'
>>> print unicode_string
Fußbälle
>>> unicode_string
u'Fu\xdfb\xe4lle'

Please bear with me if I misused any terminology; this whole concept of encodings and abstract representations of characters is extremely new to me.

Rohan
  • @Sayse: uhm, no. That doesn't explain that the REPL uses `print repr(result)`. – Martijn Pieters Feb 21 '17 at 07:55
  • @MartijnPieters - True.. still looking for a better option.. – Sayse Feb 21 '17 at 07:56
  • The `repr()` output for strings produces exact syntax to reproduce the same value again, using only ASCII characters and escape sequences. Unicode codepoints in the Latin-1 range but outside ASCII are represented using the `\xhh` escape sequence rather than the more verbose `\u00hh` sequences. – Martijn Pieters Feb 21 '17 at 07:58
  • @MartijnPieters so the REPL calls the function `repr()` when outputting the variable content? Also, I am never able to find duplicates to questions I ask. Could you tell me how you do so such that I don't post more duplicate questions? – Rohan Feb 21 '17 at 19:53
  • Yes, all expression statement results are echoed by printing the `repr()` output of the result, except when that result is `None`. As for duplicates, don't worry too much, I just have more context and experience to help search for posts. – Martijn Pieters Feb 21 '17 at 22:42
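
The behaviour described in the comments above can be sketched roughly as follows. The `echo` helper is hypothetical and exists only for illustration; the interactive interpreter does the real work through `sys.displayhook`. The snippet is written so that it should run on both Python 2 and Python 3, and it avoids printing the non-ASCII text directly so that it does not depend on the terminal encoding.

```python
from __future__ import print_function  # lets the same code run on Python 2 and 3


def echo(value):
    """Hypothetical helper: roughly what the REPL does with an expression result."""
    if value is None:
        return None        # None results are not echoed at all
    return repr(value)     # everything else is shown through repr()


unicode_string = u"Fu\u00dfb\u00e4lle"
encoded = unicode_string.encode("utf-8")

# 8 code points but 10 bytes: the two non-ASCII characters take two bytes each in UTF-8.
print(len(unicode_string), len(encoded))

# The echoed form of the byte string shows the non-ASCII bytes as \xhh escapes.
print(echo(encoded))

# Decoding the bytes gives back the original text, so no information was lost.
print(encoded.decode("utf-8") == unicode_string)
```

On Python 2, `echo(unicode_string)` gives the `u'Fu\xdfb\xe4lle'` form shown in the question. On Python 3, `repr()` of a text string keeps printable non-ASCII characters as they are, and the built-in `ascii()` produces the escaped form instead.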
