
I'm writing a Python module for displaying and entering emoji in pygame. This means I'm often working with non-BMP Unicode characters, which the Python shell apparently doesn't like.

I've made a custom string-like object that makes emoji characters and sequences easier to handle by storing each emoji sequence as a single character. However, although I'd like str(self) to return the object's raw Unicode representation, this causes problems when the object is printed or, even worse, when it's included in an error message.

This is an example of what happens when a non-BMP character is included in an error message, running Python 3.7.3 on Windows 10.

>>> raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    raise ValueError('Beware the non-BMP! \U0001f603')
Traceback (most recent call last):
  File "D:\Python37\lib\idlelib\run.py", line 474, in runcode
    exec(code, self.locals)
  File "<pyshell#0>", line 1, in <module>
Traceback (most recent call last):
  File "D:\Python37\lib\idlelib\run.py", line 474, in runcode
    exec(code, self.locals)
  File "<pyshell#0>", line 1, in <module>
ValueError: 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Python37\lib\idlelib\run.py", line 144, in main
    ret = method(*args, **kwargs)
  File "D:\Python37\lib\idlelib\run.py", line 486, in runcode
    print_exception()
  File "D:\Python37\lib\idlelib\run.py", line 234, in print_exception
    print_exc(typ, val, tb)
  File "D:\Python37\lib\idlelib\run.py", line 232, in print_exc
    print(line, end='', file=efile)
  File "D:\Python37\lib\idlelib\run.py", line 362, in write
    return self.shell.write(s, self.tags)
  File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
    value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
  File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
    return self.asyncreturn(seq)
  File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
    return self.decoderesponse(response)
  File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
    raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Python37\lib\idlelib\run.py", line 158, in main
    print_exception()
  File "D:\Python37\lib\idlelib\run.py", line 234, in print_exception
    print_exc(typ, val, tb)
  File "D:\Python37\lib\idlelib\run.py", line 220, in print_exc
    print_exc(type(context), context, context.__traceback__)
  File "D:\Python37\lib\idlelib\run.py", line 232, in print_exc
    print(line, end='', file=efile)
  File "D:\Python37\lib\idlelib\run.py", line 362, in write
    return self.shell.write(s, self.tags)
  File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
    value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
  File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
    return self.asyncreturn(seq)
  File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
    return self.decoderesponse(response)
  File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
    raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Python37\lib\idlelib\run.py", line 162, in main
    traceback.print_exception(type, value, tb, file=sys.__stderr__)
  File "D:\Python37\lib\traceback.py", line 105, in print_exception
    print(line, file=file, end="")
  File "D:\Python37\lib\idlelib\run.py", line 362, in write
    return self.shell.write(s, self.tags)
  File "D:\Python37\lib\idlelib\rpc.py", line 608, in __call__
    value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
  File "D:\Python37\lib\idlelib\rpc.py", line 220, in remotecall
    return self.asyncreturn(seq)
  File "D:\Python37\lib\idlelib\rpc.py", line 251, in asyncreturn
    return self.decoderesponse(response)
  File "D:\Python37\lib\idlelib\rpc.py", line 271, in decoderesponse
    raise what
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 32-32: Non-BMP character not supported in Tk

=============================== RESTART: Shell ===============================

As you can see, the shell appears to get stuck in a loop trying to report the error, then restarts itself to avoid hanging. Is there any way I could a) make str work differently for the error handler or b) prevent the shell restart so the error displays properly?

snakecharmerb
Oliver
  • I think IDLE (or more specifically, the tcl code in tkinter) has problems with emoji - you might need to use a different editor. – snakecharmerb Apr 22 '19 at 11:00
  • While I could use a different editor myself, my intention is to make this into a module for others to use with their projects, regardless of which editor _they_ are using. I would prefer a solution rather than a workaround otherwise this error will still appear for other users. – Oliver Apr 22 '19 at 11:31
  • That's reasonable. Until tcl/tk ships with full unicode support as standard you need some kind of workaround for IDLE. Would it be feasible to replace your emoji class with a subclass whose `__str__` method emits an alternative representation if you detect that you are running in IDLE? – snakecharmerb Apr 22 '19 at 13:28
  • This question [link](https://stackoverflow.com/questions/3431498/what-code-can-i-use-to-check-if-python-is-running-in-idle) seems to provide a way to determine whether the code is being run in IDLE. However, replacing the `__str__` method like that is likely to break user interactions with the class. It's impossible to know at conversion time what the user is going to do with the resulting string, unless you can detect the caller of the function or inspect the stack to see whether it's being called inside a print function. – Oliver Apr 22 '19 at 15:31
  • This is fixed in 3.9, 3.8, 3.7 repositories. `raise Exception('')` results in a traceback ending with `Exception('')`. Fix is too late for upcoming 3.7.5 and 3.8.0 but should appear in 3.7.6 and 3.8.1. – Terry Jan Reedy Oct 04 '19 at 20:07
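
The caller-detection idea raised in the comments can be sketched as follows. This is only an illustration: `Probe` and `render` are hypothetical names, and `sys._getframe` is a CPython implementation detail that may not exist in other interpreters.

```python
import sys


class Probe:
    """Hypothetical class whose str() reports which function requested it."""

    def __str__(self):
        # Frame 0 is this __str__ method; frame 1 is the nearest Python
        # function that triggered the conversion (built-ins such as str()
        # do not add a Python frame of their own).
        return sys._getframe(1).f_code.co_name


def render(obj):
    # str() is called from inside render(), so Probe sees 'render'.
    return str(obj)


caller = render(Probe())
```

A `__str__` method could compare that name against a known function and return a different representation when it matches.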

1 Answer

Taking ideas from snakecharmerb and these two questions, I've implemented some code that checks whether the module is being run in IDLE and, if so, whether the function is being called by the error handler. Tests appear to be working fine. I use the following to check for an IDLE environment:

import sys

# True if any of IDLE's modules has been imported, i.e. we are running in IDLE
IN_IDLE = any(name in sys.modules
              for name in ('idlelib.__main__', 'idlelib.run', 'idlelib'))

And below is the new __str__ method:

    def __str__(self):
        """ Return str(self). """
        if IN_IDLE:
            # Check the caller. traceback._some_str() is what converts an
            # exception to text for a traceback, so in that case make the
            # output IDLE-friendly (no non-BMP characters).
            callername = sys._getframe(1).f_code.co_name
            if callername == '_some_str':
                parts = []
                for char in self.__raw:
                    if ord(char) > 0xFFFF:
                        # Escape non-BMP characters as \UXXXXXXXX
                        parts.append('\\U' + format(ord(char), '08x'))
                    else:
                        parts.append(repr(char)[1:-1])
                return ''.join(parts)
        return self.__raw

Where self.__raw holds the raw text representation of the object. I'm caching it to improve efficiency since the objects are intended to be immutable.
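
The escaping step can also be tried on its own, outside the class. A minimal sketch, assuming a hypothetical standalone helper called `escape_non_bmp` (the name is made up for illustration and is not part of the module):

```python
def escape_non_bmp(text):
    """Return text with every non-BMP character written as a \\UXXXXXXXX escape.

    Hypothetical helper mirroring the loop in __str__: BMP characters go
    through repr(), so control characters such as a newline are escaped too.
    """
    parts = []
    for char in text:
        if ord(char) > 0xFFFF:
            # Outside the Basic Multilingual Plane: 8-digit hex escape.
            parts.append('\\U' + format(ord(char), '08x'))
        else:
            # Strip the quotes that repr() puts around the character.
            parts.append(repr(char)[1:-1])
    return ''.join(parts)


message = escape_non_bmp('Beware the non-BMP! \U0001f603')
# message is now pure ASCII: Beware the non-BMP! \U0001f603
```

Because the result contains only BMP characters, it should be safe to hand to a Tk widget that cannot display the original emoji.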

Of course, while this does work around the issue, I feel that Python shouldn't restart the entire shell when this occurs. I will post about it on bugs.python.org.

EDIT: Posted on bugs.python.org as issue 36698

Oliver
  • This is fixed in 3.9, 3.8, 3.7 repositories. `raise Exception('')` results in a traceback ending with `Exception('')`. Fix is too late for upcoming 3.7.5 and 3.8.0 but should appear in 3.7.6 and 3.8.1. – Terry Jan Reedy Oct 04 '19 at 20:10