On POSIX systems, the encoding used for those strings depends entirely on how your console or terminal is configured.
In those environments, use locale.getpreferredencoding() to query what encoding was configured, then use that to decode the string. This is not foolproof, but should work whenever the console or terminal was configured correctly.
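As a minimal sketch of that approach: the helper name decode_args and the UTF-8 fallback are assumptions made for illustration, not part of any library. It decodes only byte strings and leaves text that is already decoded untouched, so it runs unchanged on Python 3, where sys.argv is already text.

```python
import locale
import sys


def decode_args(args, encoding=None):
    """Decode byte-string arguments with the configured locale encoding.

    decode_args is a hypothetical helper, not a standard library function.
    """
    if encoding is None:
        # Assumption: fall back to UTF-8 if the locale reports nothing.
        encoding = locale.getpreferredencoding() or 'utf-8'
    decoded = []
    for arg in args:
        # Only byte strings need decoding; unicode text passes through.
        if isinstance(arg, bytes):
            arg = arg.decode(encoding)
        decoded.append(arg)
    return decoded


# Decode everything after the script name.
args = decode_args(sys.argv[1:])
```

If the terminal is misconfigured, the decode call raises UnicodeDecodeError; whether to catch that or let it propagate depends on your application.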
In your specific case, you are probably using a Windows system configured to use Windows code page 1252:
>>> '\x80'.decode('cp1252')
u'\u20ac'
>>> print '\x80'.decode('cp1252')
€
Windows does provide the GetCommandLineW() and CommandLineToArgvW() functions to retrieve the Unicode value for the command line, and then parse that value into an argv-like array; using this from Python can be done with the ctypes library. Paraphrasing this example, this is how you could use it:
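from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_int
from ctypes.wintypes import LPWSTR, LPCWSTR
# Build prototypes for the two Win32 functions (Windows only)
GetCommandLineW = WINFUNCTYPE(LPWSTR)(("GetCommandLineW", windll.kernel32))
CommandLineToArgvW = WINFUNCTYPE(POINTER(LPWSTR), LPCWSTR, POINTER(c_int))(("CommandLineToArgvW", windll.shell32))
argc = c_int(0)
# argv_unicode is a pointer to argc.value wide strings, not a Python list
argv_unicode = CommandLineToArgvW(GetCommandLineW(), byref(argc))
# Copy the entries into a list; this includes the interpreter and its options
args = [argv_unicode[i] for i in range(argc.value)]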
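The calls above could be wrapped into a helper that is safe to import on any platform. This is only a sketch: get_unicode_argv is a hypothetical name, and the slicing assumes that the script's own arguments are the trailing entries of the full command line, which also contains the interpreter path and any interpreter options.

```python
import sys


def get_unicode_argv():
    """Return the command-line arguments as text.

    get_unicode_argv is a hypothetical helper. On non-Windows platforms
    it simply returns a copy of sys.argv.
    """
    if sys.platform != 'win32':
        return list(sys.argv)

    # These names only exist on Windows, so import them lazily.
    from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_int
    from ctypes.wintypes import LPWSTR, LPCWSTR

    GetCommandLineW = WINFUNCTYPE(LPWSTR)(
        ("GetCommandLineW", windll.kernel32))
    CommandLineToArgvW = WINFUNCTYPE(POINTER(LPWSTR), LPCWSTR, POINTER(c_int))(
        ("CommandLineToArgvW", windll.shell32))

    argc = c_int(0)
    argv_unicode = CommandLineToArgvW(GetCommandLineW(), byref(argc))
    if not argv_unicode:
        # The call failed; fall back to what Python already parsed.
        return list(sys.argv)

    # Assumption: skip the leading interpreter entries so the result
    # lines up with sys.argv (script name first).
    start = max(0, argc.value - len(sys.argv))
    return [argv_unicode[i] for i in range(start, argc.value)]
```

The buffer returned by CommandLineToArgvW is meant to be released with LocalFree(); the sketch skips that because it is called once per process, but long-running code that calls it repeatedly should free it.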