pprint prints the representation of the object you pass it. From the docs:
The pprint module provides a capability to “pretty-print” arbitrary
Python data structures in a form which can be used as input to the
interpreter.
And "a form which can be used as input to the interpreter" means you get the object's representation, i.e., what its __repr__ method returns.
If you want strings to be printed using their __str__ method instead of their __repr__, then don't use pprint.
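To make the difference concrete, here is a minimal sketch. The sample string s and the name shown_by_pprint are made up for illustration; pformat is the standard-library function that returns the text pprint would print.

```python
from pprint import pformat

s = 'tab\there'                # sample string containing a real tab character

shown_by_pprint = pformat(s)   # the text that pprint.pprint(s) would write
print(shown_by_pprint)         # quoted, with the tab escaped, same as repr(s)
print(s)                       # plain print uses str(s): the raw characters
```

For a short string like this one, the pformat result is identical to repr(s), which is why the escape codes show up.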
Here's a Python 3 code snippet that looks for chars that get represented using a \u escape code:
    for i in range(1500):
        c = chr(i)
        r = repr(c)
        if r'\u' in r:
            print('{0:4} {0:04x} {1} {2}'.format(i, r, c))
output
888 0378 '\u0378'
889 0379 '\u0379'
896 0380 '\u0380'
897 0381 '\u0381'
898 0382 '\u0382'
899 0383 '\u0383'
907 038b '\u038b'
909 038d '\u038d'
930 03a2 '\u03a2'
1328 0530 '\u0530'
1367 0557 '\u0557'
1368 0558 '\u0558'
1376 0560 '\u0560' ՠ
1416 0588 '\u0588' ֈ
1419 058b '\u058b'
1420 058c '\u058c'
1424 0590 '\u0590'
1480 05c8 '\u05c8'
1481 05c9 '\u05c9'
1482 05ca '\u05ca'
1483 05cb '\u05cb'
1484 05cc '\u05cc'
1485 05cd '\u05cd'
1486 05ce '\u05ce'
1487 05cf '\u05cf'
Note that codepoints > 0xffff get represented using a \U escape code, when necessary.
    for i in range(65535, 65600):
        c = chr(i)
        r = repr(c)
        if r'\u' in r.lower():
            print('{0:4} {0:04x} {1} {2}'.format(i, r, c))
output
65535 ffff '\uffff' �
65548 1000c '\U0001000c'
65575 10027 '\U00010027'
65595 1003b '\U0001003b'
65598 1003e '\U0001003e'
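As far as I know, "when necessary" in CPython 3 comes down to str.isprintable(): repr() escapes characters that Python considers non-printable, and the codepoint's size decides between \x, \u and \U. The sketch below checks that on a few hand-picked codepoints; escape_kind is a hypothetical helper written for this example, not part of any library.

```python
def escape_kind(c):
    """Return 'x', 'u' or 'U' if repr(c) uses that hex escape, else None."""
    body = repr(c)[1:-1]              # drop the surrounding quotes
    if len(body) > 1 and body[0] == '\\' and body[1] in 'xuU':
        return body[1]
    return None                       # printed as-is, or a short escape like \n

# 'A', the BEL control char, and two Unicode noncharacters
for cp in (0x41, 0x07, 0xffff, 0x10ffff):
    c = chr(cp)
    print('{:06x} printable={!s:5} escape={}'.format(
        cp, c.isprintable(), escape_kind(c)))
```

Which codepoints count as printable depends on the Unicode database bundled with your Python version, so the lists above can differ between releases.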