
I'm working on my first-ever Python project and I've been stuck for a while now. It's a program that reads my email from a specific mail folder and saves each message to a CSV file. The problem is that when someone includes an image or attaches something, the program saves the image code instead of the message text.

def checkmail():
    try:
        # create an IMAP4 class with SSL
        imap = imaplib.IMAP4_SSL("imap.gmail.com")
        imap.login(username, password)
        imap.select("FromB") #Inbox to use
        result, data = imap.uid('search', None, "UNSEEN")  # (ALL/UNSEEN)
        inbox_item_list = data[0].split()

        most_recent = inbox_item_list[-1]

        for item in inbox_item_list:
            result2, email_data = imap.uid('fetch', most_recent, '(RFC822)')
            raw_email = email_data[0][1].decode("utf-8")
            email_message = email.message_from_string(raw_email)
            date_ = email_message['date'].format(time.strftime("%d-%b-%Y"))
            counter = 1
            for part in email_message.walk():
                if part.get_content_maintype() == "multipart":
                    continue
                filename = part.get_filename()
                content_type = part.get_content_type()

                if not filename:
                    ext = mimetypes.guess_extension(part.get_content_type())
                    if not ext:
                        ext = '.bin'
                    if 'text' in content_type:
                        ext = '.txt'
                    filename = 'msg-part-%08d%s' %(counter, ext)
                counter += 1

            save_path = os.path.join(os.getcwd(), "jobb")
            if not os.path.exists(save_path):
                os.makedirs(save_path)
            with open(os.path.join(save_path, filename), 'wb') as fp:
                fp.write(part.get_payload(decode=True))


                #print(content_type)
                if "plain" in content_type:
                    print("z")
                    #print(part.get_payload())
                elif "html" in content_type:
                    html_ = part.get_payload(decode=True)
                    charset = part.get_content_charset('iso-8859-1')
                    chars = html_.decode(charset, 'replace')
                    soup = BeautifulSoup(chars, "html.parser")
                    text_mail = soup.get_text()
                    print("A new csv saved")
                else:
                    # Earlier I had just a print call here, and every time
                    # there was an image the program reached that print call.
                    # So I figured I could instead put in the code that saves
                    # the text when there is no image, but that didn't work.
                    # Now I'm stuck; I don't know how to solve this and would
                    # like guidance.
                    html_ = part.get_payload(decode=True)
                    charset = part.get_content_charset('iso-8859-1')
                    chars = html_.decode(charset, 'replace')
                    soup = BeautifulSoup(chars, "html.parser")
                    text_mail = soup.get_text()
                    print("One new csv saved")

        testing123 = {'text': [text_mail],
                      'date': [date_]
                     }

        df = pd.DataFrame(testing123, columns=['text', 'date'])
        dt = datetime.today()
        date_today = dt.timestamp()
        df.to_csv("D:\APMPYTHON\(%s).csv" % date_today)
    except:
        print("No new email")
martineau
user2227874
  • It looks to me like your code would save any attachments *and* ... somehow attempt to save any HTML as CSV, except that code path doesn't actually save anything. Is the problem simply that you forgot to add code to write out the message text? Why do you save other file types if you don't want them? – tripleee Aug 08 '20 at 16:47
  • Your IMAP code has some issues too: you should fetch the bytes without decoding them and then use `email.message_from_bytes` instead of `message_from_string`. With modern Python, you want to pass in `policy=email.policy.default` to get the new and improved version of the `email` library, which includes a convenient method to pull out what it considers to be the "main" part of the message, so you can process only that. – tripleee Aug 08 '20 at 16:51
  • The blanket `except` is also a huge bug; see https://stackoverflow.com/questions/54948548/what-is-wrong-with-using-a-bare-except – tripleee Aug 08 '20 at 19:44
  • Try using https://github.com/ikvk/imap_tools – Vladimir Aug 12 '20 at 07:44
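
The suggestion in the comments about `email.message_from_bytes` and `policy=email.policy.default` could look roughly like the sketch below. It is not the asker's code: `extract_text` and `build_sample` are hypothetical helper names, and the sample message is built locally only so the snippet runs without an IMAP server. The idea is that `get_body()` returns what the library considers the main part of the message, so image and file attachments are never touched.

```python
import email
from email import policy
from email.message import EmailMessage


def extract_text(raw_bytes: bytes) -> str:
    """Return the main text of a raw message, skipping attachments.

    Hypothetical helper: parses undecoded bytes, as the comments advise.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    # Prefer the plain-text body; fall back to the HTML body if there is none.
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is None:
        return ""
    # get_content() decodes the transfer encoding and the charset.
    return body.get_content()


def build_sample() -> bytes:
    """Build a test message with a text body and a fake image attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Test"
    msg["From"] = "a@example.com"
    msg["To"] = "b@example.com"
    msg.set_content("Hello from the message body")
    msg.add_attachment(
        b"\x89PNG fake image bytes",
        maintype="image",
        subtype="png",
        filename="pic.png",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    print(extract_text(build_sample()))
```

In the loop from the question, the bytes in `email_data[0][1]` would presumably be passed to such a helper directly, without calling `.decode("utf-8")` first. If only an HTML body exists, the returned string is still markup, so it would still need to go through `BeautifulSoup(...).get_text()` as in the original code.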

0 Answers