
The following code uses IMAP to find emails by subject line; it returns all parts of each email and downloads the attachments. However, I only need it to download the attachments, not the entire body as well. I understand this has to do with the `for part in email_message.walk():` loop, which iterates over the entire email. Could someone please help me change this code so that it downloads only the attachments? I'm sure this is a simple change, but I'm just not sure how to make it!

import imaplib
import email.header
import os
import sys
import csv
# Your IMAP Settings
host = 'imap.gmail.com'
user = 'User email'
password = 'User password'

# Connect to the server
print('Connecting to ' + host)
mailBox = imaplib.IMAP4_SSL(host)

# Login to our account
mailBox.login(user, password)

boxList = mailBox.list()
# print(boxList)

mailBox.select()
searchQuery = '(SUBJECT "CDR Schedule output from schedule: This is a test to see how it works")'

result, data = mailBox.uid('search', None, searchQuery)
ids = data[0]
# list of uids
id_list = ids.split()

i = len(id_list)
for x in range(i):
    latest_email_uid = id_list[x]

    # fetch the email body (RFC822) for the given ID
    result, email_data = mailBox.uid('fetch', latest_email_uid, '(RFC822)')
    # I think I am fetching a bit too much here...

    raw_email = email_data[0][1]

    # converts byte literal to string removing b''
    raw_email_string = raw_email.decode('utf-8')
    email_message = email.message_from_string(raw_email_string)

    # downloading attachments
    for part in email_message.walk():

        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue
        fileName = part.get_filename()

        if bool(fileName):
            filePath = os.path.join('C:/install files/', fileName)
            if not os.path.isfile(filePath) :
                fp = open(filePath, 'wb')
                fp.write(part.get_payload(decode=True))
                fp.close()



    subject = str(email_message).split("Subject: ", 1)[1].split("\nTo:", 1)[0]
    print('Downloaded "{file}" from email titled "{subject}" with UID {uid}.'.format(file=fileName, subject=subject, uid=latest_email_uid.decode('utf-8')))

mailBox.close()
mailBox.logout()
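
For reference, the `walk()` loop above already skips any part that has no `Content-Disposition` header or no filename, so body parts should not be written to disk. The sketch below shows the same selection in a self-contained form, assuming Python 3.6+. It builds a synthetic message (HTML body plus a CSV attachment) in place of a real IMAP fetch; the helper names `build_sample_message` and `extract_attachments`, the addresses, and `report.csv` are all made up for illustration.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_sample_message():
    """Build a synthetic message (text + HTML body, one CSV attachment)."""
    msg = EmailMessage()
    msg["Subject"] = "CDR Schedule output"
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "receiver@example.com"      # placeholder address
    msg.set_content("Plain-text fallback body")
    msg.add_alternative("<html><body><p>HTML body</p></body></html>",
                        subtype="html")
    msg.add_attachment(b"a,b\n1,2\n", maintype="text", subtype="csv",
                       filename="report.csv")
    return msg.as_bytes()


def extract_attachments(raw_email):
    """Return (filename, payload_bytes) for attachment parts only."""
    # Parse straight from bytes, so no manual .decode('utf-8') is needed.
    msg = message_from_bytes(raw_email, policy=policy.default)
    found = []
    for part in msg.walk():
        # Body parts (text/plain, text/html) and multipart containers
        # do not carry an "attachment" disposition, so they are skipped.
        if part.get_content_disposition() != "attachment":
            continue
        found.append((part.get_filename(), part.get_payload(decode=True)))
    return found


attachments = extract_attachments(build_sample_message())
```

In the loop above, `raw_email` (the bytes from `email_data[0][1]`) could be passed to `extract_attachments` in place of the synthetic message.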
Michael Butscher
  • Related: https://stackoverflow.com/a/27556667/3613640 – Josh Cooley Mar 25 '20 at 15:20
  • Thank you for your reply. I actually based this code on that one, although that code does not search emails by subject as this one does. The parts that download the attachment in this code and in the one you linked seem nearly identical. Could you show me which part of my code is incorrectly returning the entire email instead of just the attachment, and then show me how to fix it so that it only downloads the attachment? – Garrett Kidd Mar 25 '20 at 15:41
  • Email senders aren't awfully disciplined, and will send mail that bothers your code, whatever your code does. So you might as well start a corpus of test messages now, because you'll have to do it sooner or later anyway. The first message should be one the code already mishandles, and you can post that message here, so that your question isn't "my code handles an unspecified case badly, please help". – arnt Mar 25 '20 at 15:52
  • I appreciate the response. Here is the error message that my code shows. I believe it fails because the body of the email has HTML in it. `IOError: [Errno 22] invalid mode ('wb') or filename: 'C:/install files/This is a test to see how it\r\n works_lhac.com_2020-03-17 13:13:43_2020-03-24\r\n 18:13:43_Advanced__lhac.com.csv'` For reference, the email I am trying to get only the attachment from is a weekly automated email that sends a CSV attachment. However, that email body has HTML in it, and the code raises this error. Hope this is what you mean. – Garrett Kidd Mar 25 '20 at 15:58
  • I know the issue is with the HTML body of the email, because I sent the exact same attachment in a plaintext format to myself, and used that as a test and it worked perfectly. – Garrett Kidd Mar 25 '20 at 15:59
  • Sorry, meant to link this answer. I believe this applies to your issue: "change mail = email.message_from_string(email_body) in downloaAttachmentsInEmail to mail = email.message_from_bytes(email_body)". https://stackoverflow.com/a/59650617/3613640 – Josh Cooley Mar 25 '20 at 16:35
  • I just tested that code exactly, making the changes that were outlined in that post, changing from string to bytes. The error still occurs whenever there is HTML in the email body. That code works just fine whenever the email is plaintext. Any ideas? – Garrett Kidd Mar 25 '20 at 16:50
  • Any updates for this? – Garrett Kidd Mar 25 '20 at 18:57
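
The filename in the `IOError` quoted above contains `\r\n` sequences (from a folded header) and `:` characters (from the timestamps). Windows does not allow either in a file name, so the failure may come from the attachment's name and not from the HTML body. A minimal sketch of cleaning the name before opening the file, assuming a Windows target; `sanitize_filename` is a hypothetical helper, not part of the `email` module:

```python
import re

# Characters Windows does not allow in file names, plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(name, replacement="_"):
    """Hypothetical helper: make an attachment name usable as a Windows file name."""
    # Collapse folded-header line breaks (CRLF plus indentation) into one space.
    name = re.sub(r"\s*[\r\n]+\s*", " ", name)
    # Replace the remaining forbidden characters, e.g. the ':' in timestamps.
    name = _INVALID_CHARS.sub(replacement, name)
    # Windows also rejects names that end in a space or a dot.
    return name.strip().rstrip(". ")


# The attachment name exactly as it appears in the error message.
raw_name = ("This is a test to see how it\r\n works_lhac.com_2020-03-17 "
            "13:13:43_2020-03-24\r\n 18:13:43_Advanced__lhac.com.csv")
safe_name = sanitize_filename(raw_name)
```

In the question's loop, the cleaned name would be passed to `os.path.join('C:/install files/', ...)` in place of `fileName`.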

0 Answers