I am logging into a server with my credentials and searching for the file attached to the latest email. I can read all the header information, such as 'FROM', 'TO', 'SUBJECT' and so on, without any issues. What I cannot understand is why the file name of the attachment is not parsed correctly. The file name should look like '13079.xls', but a string like '=?utf-8?B?MTMwNzKueGxz?=' is returned instead.

Any suggestions?

import time
import imaplib
import base64
import os
import email
import glob
import csv

email_user = 'xxxxxxx@pangeare.com'
email_pass = 'xxxxxxxxxx'
host='imap.gmail.com'
port=993

mail = imaplib.IMAP4_SSL(host,port)
mail.login(email_user, email_pass)
mail.select('Inbox')
typ, data = mail.search(None, 'ALL')  # 'typ' avoids shadowing the built-in type()

# Walk the message numbers from newest to oldest
for num in data[0].split()[::-1]:
    typ, data = mail.fetch(num, '(RFC822)')
    raw_email = data[0][1]
    raw_email_string = raw_email.decode('ascii')
    email_message = email.message_from_string(raw_email_string)
    print(email_message['To'])
    print(email_message['Subject'])
    print(email.utils.parseaddr(email_message['From']))
    for part in email_message.walk():
        # Skip container parts and parts that are not attachments
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue
        filename = part.get_filename()
        print(filename)  # prints the still-encoded name, e.g. '=?utf-8?B?...?='
OneCricketeer
Yang L
  • A better [mre] might include a minimal sample `raw_email` (specifically, the shortest possible sample email message that produces the same problem) as an input instead of having this only be testable by someone who has working credentials to your gmail inbox and the same email inside it. – Charles Duffy Jul 28 '21 at 17:19
  • `MTMwNzKueGxz` is base64 for `13072xls`, so it looks like it's encoded. You can do something [like this](https://stackoverflow.com/a/3470583) to decode it. – Henry Jul 28 '21 at 17:36
  • 1) Use message_from_bytes. 2) Use a newer policy, rather than the default compat32 layer, and it will do the decoding for you. E.g.: email.message_from_bytes(raw_email, policy=email.policy.default). For backwards compatibility, compat32 (the default) doesn't do decoding, because Python 3.2 didn't. – Max Jul 28 '21 at 18:33
  • See https://docs.python.org/3/library/email.policy.html#email.policy.default for details on email policies, which change how parsing occurs. (Old examples just make things too complicated.) – Max Jul 28 '21 at 18:34
  • The suggestion: https://github.com/ikvk/imap_tools – Vladimir Jul 29 '21 at 03:38
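
The two approaches suggested in the comments above can be sketched roughly as follows. This is only a minimal sketch, not a tested fix for the original mailbox: `RAW_EMAIL` is a made-up sample message (addresses, boundary and body are assumptions), and its attachment name is the RFC 2047 encoded form of `13079.xls`. The helper names `attachment_names_manual` and `attachment_names_modern` are hypothetical. Both helpers parse the bytes returned by `mail.fetch` directly, so the `raw_email.decode('ascii')` step is not needed.

```python
import email
import email.policy
from email.header import decode_header, make_header

# Hypothetical sample message; the filename is the encoded word for "13079.xls".
RAW_EMAIL = (
    b"From: sender@example.com\r\n"
    b"To: receiver@example.com\r\n"
    b"Subject: report\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="BOUND"\r\n'
    b"\r\n"
    b"--BOUND\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"See attachment.\r\n"
    b"--BOUND\r\n"
    b"Content-Type: application/vnd.ms-excel\r\n"
    b'Content-Disposition: attachment; filename="=?utf-8?B?MTMwNzkueGxz?="\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"AAAA\r\n"
    b"--BOUND--\r\n"
)


def attachment_names_manual(raw):
    """Keep the default compat32 policy and decode each filename by hand."""
    msg = email.message_from_bytes(raw)
    names = []
    for part in msg.walk():
        name = part.get_filename()  # still encoded under compat32
        if name:
            # decode_header splits the encoded word; make_header rebuilds the text
            names.append(str(make_header(decode_header(name))))
    return names


def attachment_names_modern(raw):
    """Parse with email.policy.default, which decodes header values itself."""
    msg = email.message_from_bytes(raw, policy=email.policy.default)
    return [part.get_filename() for part in msg.walk() if part.get_filename()]


if __name__ == "__main__":
    print(attachment_names_manual(RAW_EMAIL))
    print(attachment_names_modern(RAW_EMAIL))
```

In the loop from the question, this would presumably mean replacing the `decode('ascii')` and `message_from_string` lines with a single `email.message_from_bytes(raw_email, policy=email.policy.default)` call and leaving the `walk()` loop unchanged.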

0 Answers