I want to decode quoted-printable encoded strings in Python, but I'm stuck.
I fetch certain mails from my Gmail account with the following code:
import imaplib
import email
import quopri
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('mail@gmail.com', '*******')
mail.list()
mail.select('"[Gmail]/All Mail"')
typ, data = mail.search(None, 'SUBJECT', '"{}"'.format('123456'))
data[0].split()
print(data[0].split())
for e_mail in data[0].split():
    typ, data = mail.fetch('{}'.format(e_mail.decode()), '(RFC822)')
    raw_mail = data[0][1]
    email_message = email.message_from_bytes(raw_mail)
    if email_message.is_multipart():
        for part in email_message.walk():
            if part.get_content_type() == 'text/plain':
                body = part.get_payload()
    to = email_message['To']
    utf = quopri.decodestring(to)
    text = utf.decode('utf-8')
    print(text)
.
.
.
If I print 'to', for example, and the value contains characters like é, á, ó, the result is:
=?UTF-8?B?UMOpdGVyIFBldMWRY3o=?=
I can successfully decode the quoted-printable 'body' string with the quopri library like this:
quopri.decodestring(sometext).decode('utf-8')
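As a minimal illustration of that working case (the sample bytes below are made up for this example, not taken from a real mail):

```python
import quopri

# Hypothetical sample: "Péter" as quoted-printable encoded UTF-8 bytes.
sample = b'P=C3=A9ter'

# decodestring() undoes the =XX escapes and returns raw bytes,
# which are then interpreted as UTF-8 text.
decoded = quopri.decodestring(sample).decode('utf-8')
print(decoded)
```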
But the same logic doesn't work for other parts of the e-mail, such as the To, From, and Subject headers.
Does anyone have a hint?
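One possible direction, offered as a sketch rather than a definitive fix: the printed value looks like an RFC 2047 "encoded-word" (`=?charset?encoding?text?=`), and the `B` in it marks Base64 rather than quoted-printable, which would explain why quopri leaves it unchanged. The standard library's `email.header.decode_header` handles both the `B` and `Q` variants. The variable names `raw_to` and `decoded_to` are only illustrative, and `raw_to` stands in for `email_message['To']`:

```python
from email.header import decode_header, make_header

# The header value exactly as printed in the question.
raw_to = '=?UTF-8?B?UMOpdGVyIFBldMWRY3o=?='

# decode_header() splits the value into (bytes, charset) pairs;
# make_header() reassembles them, and str() yields readable text.
decoded_to = str(make_header(decode_header(raw_to)))
print(decoded_to)

# The same call also covers the quoted-printable ("Q") variant.
raw_q = '=?UTF-8?Q?P=C3=A9ter?='  # hypothetical sample
decoded_q = str(make_header(decode_header(raw_q)))
print(decoded_q)
```

For the sample above this gives `Péter Petőcz`. The same approach should apply to the From and Subject headers, assuming they are encoded the same way.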