
When writing one-off scripts for file management, I often use the print function to check that, for example, the list of files I am operating on is what I intend it to be. Consider:

import glob

for path in glob.glob("*.mp3"):
    print(path)

On Windows (and in Cygwin when LANG=C is set, so this may apply to Unix too), this raises a UnicodeEncodeError if a file name contains characters that the console encoding cannot represent.
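The failure can be reproduced without a Windows console. The following is a minimal sketch, assuming an in-memory stream with a limited encoding behaves like such a console; the code page cp437 and the file name are made up for illustration.

```python
import io

# Stand-in for a console with a limited encoding. "cp437" is only an
# example of a legacy code page, not necessarily what your console uses.
fake_console = io.TextIOWrapper(io.BytesIO(), encoding="cp437")

path = "caf\u00e9 \u2329song\u232a.mp3"  # hypothetical file name

try:
    # cp437 can represent the e-acute but not U+2329, so this write fails.
    print(path, file=fake_console)
    failed = False
except UnicodeEncodeError:
    failed = True
```

The error comes from the stream's encoder, not from glob or from the string itself, which is why the same script can work in one terminal and fail in another.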

In Cygwin, using print(repr(path)) instead escapes unsupported characters as \uxxxx. In the Windows console, however, it still raises a UnicodeEncodeError.
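One short alternative that may be worth trying is the Python 3 built-in ascii(), which works like repr() but always escapes non-ASCII characters, regardless of the console encoding. A small sketch with a made-up file name:

```python
path = "caf\u00e9 \u2329song\u232a.mp3"  # hypothetical file name

# ascii() returns a quoted, pure-ASCII representation: characters below
# U+0100 become \xNN, other BMP characters become \uNNNN.
escaped = ascii(path)
print(escaped)
```

The trade-off is that every non-ASCII character is escaped, including those the console could have displayed, and the output is wrapped in quotes.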

Partial Solution

The closest things to a solution I found were:

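import sys

unicodestring = "Hello \u2329\u3328\u3281\u1219 World"

# Attempt 1: breaks even supported characters, because the text is
# encoded with one codec and decoded with another
print(repr(unicodestring).encode("utf8").decode(sys.stdout.encoding))

# Attempt 2: works, but escapes every non-ASCII character, supported or not
print(unicodestring.encode("unicode-escape").decode("ascii"))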

Both are rather verbose for "quick and dirty" scripts, especially when the print call combines several strings that may contain non-ASCII characters.
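One way to keep the call sites short is to move the verbosity into a small helper. The sketch below is only an illustration: safe_print is a hypothetical name, and it assumes the target stream exposes an encoding attribute (falling back to ASCII otherwise). It round-trips the text through the stream's own encoding with the backslashreplace error handler, so only characters the stream cannot represent are escaped.

```python
import io
import sys


def safe_print(*args, sep=" ", end="\n", file=None):
    """Hypothetical helper: like print(), but escapes unencodable characters."""
    stream = sys.stdout if file is None else file
    # Assumption: fall back to ASCII if the stream reports no encoding.
    encoding = getattr(stream, "encoding", None) or "ascii"
    text = sep.join(str(arg) for arg in args) + end
    # Encode and decode with the SAME codec; characters the codec cannot
    # represent are replaced by backslash escapes instead of raising.
    stream.write(text.encode(encoding, errors="backslashreplace").decode(encoding))


# Demo against an ASCII-only stream standing in for a limited console.
demo = io.TextIOWrapper(io.BytesIO(), encoding="ascii", newline="\n")
safe_print("Hello \u2329\u3328 World", 42, file=demo)
demo.flush()
result = demo.buffer.getvalue().decode("ascii")
```

On Python 3.7 and later, calling sys.stdout.reconfigure(errors="backslashreplace") once at the top of the script should have a similar effect for plain print calls, without needing a helper.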

Similar questions exist for Python 2, but the solutions don't usually work with Python 3. Also, they are often equally verbose.

kdb
  • The quick and dirty answer would be `unicodestring.encode(sys.stdout.encoding, errors='replace')`. Encoding with one codec and decoding with another doesn't make sense. It's like encoding an image using PNG and decoding it as JPEG. Anyway there are plenty of better answers to your question elsewhere on StackOverflow. – roeland Jul 09 '15 at 23:33
  • The verbosity still remains an issue for quick-and-dirty scripts though. A full example with `print` (in Python 3, where `encode` doesn't return a string and is therefore printed as a bytes literal) would read `print(unicodestring.encode(sys.stdout.encoding,errors="replace").decode(sys.stdout.encoding))`, which I already consider plenty long for a script that may be only a few lines long. – kdb Jul 11 '15 at 15:48
  • @kdb, for Python 3 you can use [win-unicode-console](https://github.com/Drekin/win-unicode-console). – Eryk Sun Jul 12 '15 at 17:11

0 Answers