You can try decoding with ASCII first.
attach_size = msg.get_attachment(0).get_size()
print((msg.get_attachment(0).read_buffer(attach_size)).decode('ascii', errors="ignore"))
I think Microsoft uses more than one encoding for different parts of an attachment, so no single codec can decode everything perfectly. If ASCII does not recover enough of the content, you can try them all. The list of available encodings differs between Python versions; check it out here.
# 98 encodings in python3.5/6/7
decode = ['ascii','big5','big5hkscs','cp037','cp273',
'cp424','cp437','cp500','cp720','cp737',
'cp775','cp850','cp852','cp855','cp856',
'cp857','cp858','cp860','cp861','cp862',
'cp863','cp864','cp865','cp866','cp869',
'cp874','cp875','cp932','cp949','cp950',
'cp1006','cp1026','cp1125','cp1140','cp1250',
'cp1251','cp1252','cp1253','cp1254','cp1255',
'cp1256','cp1257','cp1258','cp65001','euc_jp',
'euc_jis_2004','euc_jisx0213','euc_kr','gb2312','gbk',
'gb18030','hz','iso2022_jp','iso2022_jp_1','iso2022_jp_2',
'iso2022_jp_2004','iso2022_jp_3','iso2022_jp_ext','iso2022_kr','latin_1',
'iso8859_2','iso8859_3','iso8859_4','iso8859_5','iso8859_6',
'iso8859_7','iso8859_8','iso8859_9','iso8859_10','iso8859_11',
'iso8859_13','iso8859_14','iso8859_15','iso8859_16','johab',
'koi8_r','koi8_t','koi8_u','kz1048','mac_cyrillic',
'mac_greek','mac_iceland','mac_latin2','mac_roman','mac_turkish',
'ptcp154','shift_jis','shift_jis_2004','shift_jisx0213','utf_32',
'utf_32_be','utf_32_le','utf_16','utf_16_be','utf_16_le',
'utf_7','utf_8','utf_8_sig']
# Select the best decoder
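items = []
# Read the raw bytes once, then try every codec on the same buffer.
attachment = msg.get_attachment(0)
attach_size = attachment.get_size()
raw = attachment.read_buffer(attach_size)
for item in decode:
    try:
        content = raw.decode(item, errors="ignore")
    except LookupError:
        continue  # codec not available in this Python version or on this platform
    # I know 'sample_content' is in the attachment, so it's easy to see which ones can decode it.
    if 'sample_content' in content:
        items.append(item)
print(items)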
If you don't know what's in the content, you can try workarounds. For instance, in the loop you can pick the decoding that leaves the fewest "\x" escapes, since before decoding your content looks like this: "\x93\x93\xfa\x8c\xd3\x1a\xc6".
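A rough sketch of that workaround, in case it helps. The helper names `count_suspect_chars` and `rank_encodings` are made up for this example, and it works on any `bytes` object rather than on the attachment directly. Note one deliberate change from the loop above: it decodes with errors="replace" instead of errors="ignore", so that bytes a codec cannot handle show up as U+FFFD and can be counted instead of silently disappearing. It is only a heuristic: several single-byte code pages can tie with a low count, so you should still look at the top candidates yourself.

```python
def count_suspect_chars(raw, encoding):
    """Count characters that suggest the wrong codec was used."""
    # errors="replace" turns undecodable bytes into U+FFFD so they can be counted.
    text = raw.decode(encoding, errors="replace")
    suspect = 0
    for ch in text:
        if ch in "\r\n\t":
            continue  # ordinary whitespace is not suspicious
        if ch == "\ufffd" or not ch.isprintable():
            suspect += 1
    return suspect


def rank_encodings(raw, candidates):
    """Return (suspect_count, encoding) pairs, best candidates first."""
    ranked = []
    for name in candidates:
        try:
            ranked.append((count_suspect_chars(raw, name), name))
        except LookupError:
            continue  # codec unknown to this Python build, skip it
    return sorted(ranked)


if __name__ == "__main__":
    # Made-up sample data: Japanese text stored as Shift-JIS bytes.
    sample = "日本語のテキスト".encode("shift_jis")
    for count, name in rank_encodings(sample, ["ascii", "utf_8", "shift_jis"]):
        print(count, name)
```

With the attachment, you would pass the bytes returned by read_buffer and the full list of encodings, then inspect the first few entries of the result.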
If anyone has a better way of decoding attachments, please leave a comment here. Thank you.