I'm currently fetching mailbox names from an imaplib.IMAP4_SSL connection (conn) like this:

mbs = [e.decode(encoding) for e in conn.list()[1] if isinstance(e, bytes)]

(Python 3.8 on Ubuntu 18.04, connecting to imap.gmx.com, btw)

But regardless of which encoding I use, I always get the same incorrectly decoded strings: ['Entw&APw-rfe', 'Gel&APY-scht',...]

I've tried all encodings known to me:

{'gb18030', 'hz', 'quopri_codec', 'cp861', 'hp_roman8', 'cp424', 'cp869', 'kz1048', 
'ascii', 'cp273', 'zlib_codec', 'cp1125', 'cp1257', 'latin_1', 'iso8859_9', 'mbcs', 'cp852', 
'utf_16_le', 'shift_jis', 'iso2022_jp_1', 'cp857', 'euc_jisx0213', 'mac_greek', 'big5hkscs', 
'utf_7', 'koi8_r', 'big5', 'cp1255', 'mac_iceland', 'tactis', 'mac_turkish', 'base64_codec', 
'cp1026', 'iso8859_5', 'cp858', 'shift_jisx0213', 'iso8859_7', 'mac_roman', 'utf_32_be', 'gbk', 
'gb2312', 'cp866', 'cp1251', 'cp437', 'iso2022_jp_2', 'bz2_codec', 'euc_jis_2004', 
'mac_cyrillic', 'cp865', 'iso2022_jp_ext', 'utf_16_be', 'iso8859_14', 'cp037', 'iso8859_2', 
'iso2022_kr', 'cp950', 'cp860', 'hex_codec', 'cp850', 'iso8859_15', 'tis_620', 'cp855', 
'rot_13', 'cp1253', 'iso8859_4', 'cp1254', 'shift_jis_2004', 'iso8859_3', 'iso8859_11', 
'iso8859_13', 'cp1252', 'iso2022_jp', 'cp932', 'iso2022_jp_3', 'utf_8', 'euc_kr', 'cp1140', 
'cp500', 'utf_16', 'utf_32', 'cp949', 'mac_latin2', 'johab', 'cp1250', 'ptcp154', 'uu_codec', 
'iso8859_16', 'cp862', 'cp864', 'iso8859_6', 'cp775', 'cp1258', 'utf_32_le', 'euc_jp',
'iso8859_10', 'iso8859_8', 'cp1256', 'cp863', 'iso2022_jp_2004'}

So to me it looks like the names got decoded incorrectly and then re-encoded to UTF-8 on the way to me.

What am I doing wrong here?

frans
  • That uses an encoding known as "modified UTF-7", "mUTF-7" or similar. It's a pain. The author posted a public apology for it. I don't know whether you'll find a ready-made Python decoder, and I'd rather forget the whole thing. At least now you know the right search term. Good luck. – arnt Sep 25 '20 at 16:12
  • The answers to the duplicate target linked above list a number of third party packages that will decode modified UTF-7. – snakecharmerb Sep 26 '20 at 06:29
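
For illustration, here is a minimal sketch of a modified UTF-7 decoder using only the standard library, written from the description in RFC 3501, section 5.1.3. The function name decode_imap_utf7 is hypothetical and not part of imaplib. The bytes returned by conn.list() are plain ASCII, which would explain why every ASCII-compatible codec yields the same string, and Python's built-in utf_7 codec does not help because standard UTF-7 shifts with "+" rather than "&".

```python
import base64


def decode_imap_utf7(name: str) -> str:
    """Decode an IMAP mailbox name from modified UTF-7 (hypothetical helper)."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch != "&":
            # Printable ASCII outside a shift sequence stands for itself.
            out.append(ch)
            i += 1
            continue
        end = name.find("-", i + 1)
        if end == -1:
            raise ValueError("unterminated '&' shift sequence")
        chunk = name[i + 1:end]
        if not chunk:
            # "&-" is the escape for a literal ampersand.
            out.append("&")
        else:
            # Modified base64: "," replaces "/", and padding is omitted.
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)
            # The payload is UTF-16 in big-endian byte order.
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


# Decode with ASCII first, then undo the modified UTF-7 layer.
raw_names = [b"Entw&APw-rfe", b"Gel&APY-scht"]
mbs = [decode_imap_utf7(e.decode("ascii")) for e in raw_names]
print(mbs)
```

With the two names from the question this should give 'Entwürfe' and 'Gelöscht'. The sketch only decodes; creating or renaming mailboxes would need the matching encoder, which the third-party packages mentioned in the comment above may already provide.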
