31

I'd like to fetch the whole message from an IMAP4 server. In the Python docs I found this bit of code, which works:

>>> t, data = M.fetch('1', '(RFC822)')
>>> body = data[0][1]

I'm wondering if I can always trust that data[0][1] returns the body of the message. When I fetched 'RFC822.SIZE' I got just a string instead of a tuple.

I've skimmed through RFC 1730, but I wasn't able to figure out the proper response structure for 'RFC822'. It is also hard to tell the structure of the fetch result from the imaplib documentation.

Here is what I'm getting when fetching RFC822:

('OK', [('1 (RFC822 {858569}', 'body of the message', ')')])

But when I fetch RFC822.SIZE I'm getting:

('OK', ['1 (RFC822.SIZE 847403)'])

How should I properly handle the data[0] list? Can I trust that when it is a list of tuples, each tuple has exactly 3 parts and the second part is the payload?

Or do you know of a better library for IMAP4?

Robert Siemer
Piotr Czapla

4 Answers

40

No... imaplib is a pretty good library; it's IMAP itself that's so unintelligible.

You should check that t == 'OK', but data[0][1] has worked as expected for everything I've used it for.
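A minimal sketch of that check, assuming the response shapes shown in the question and in the comments below (tuples for literal data, plain strings or bytes for everything else). The helper name `extract_payload` is hypothetical and is not part of imaplib:

```python
def extract_payload(typ, data):
    """Return the first literal payload in an imaplib fetch response.

    Assumption: literal data (such as an RFC822 body) arrives as a tuple
    whose second element is the payload, while other items in the list
    are plain str/bytes. Returns None when no tuple is present, as with
    an RFC822.SIZE fetch.
    """
    if typ != 'OK':
        raise RuntimeError('fetch failed with status %r' % (typ,))
    for item in data:
        # Do not assume the tuple sits at index 0: skip plain items.
        if isinstance(item, tuple) and len(item) >= 2:
            return item[1]
    return None


# Shapes taken from this page; the byte contents are made up.
usual = [(b'1 (RFC822 {4}', b'body'), b')']
shifted = [b'1 (FLAGS (\\Seen))', (b'1 (RFC822 {4}', b'body'), b')']
size_only = [b'1 (RFC822.SIZE 847403)']

body_usual = extract_payload('OK', usual)
body_shifted = extract_payload('OK', shifted)
body_size = extract_payload('OK', size_only)
```

With a real connection this would be called as `extract_payload(*M.fetch('1', '(RFC822)'))`. Scanning for the first tuple rather than indexing data[0] is a design choice that tolerates extra leading items.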

Here's a quick example I use to extract signed certificates I've received by email. It's not bomb-proof, but it suits my purposes:

import getpass, os, imaplib, email
from OpenSSL.crypto import load_certificate, FILETYPE_PEM

def getMsgs(servername="myimapserverfqdn"):
  usernm = getpass.getuser()
  passwd = getpass.getpass()
  subject = 'Your SSL Certificate'
  conn = imaplib.IMAP4_SSL(servername)
  conn.login(usernm, passwd)
  conn.select('Inbox')
  typ, data = conn.search(None, '(UNSEEN SUBJECT "%s")' % subject)
  for num in data[0].split():
    typ, msg_data = conn.fetch(num, '(RFC822)')
    # imaplib returns bytes on Python 3, so parse with message_from_bytes
    msg = email.message_from_bytes(msg_data[0][1])
    # fetching RFC822 sets \Seen; clear the flag so the message stays unread
    conn.store(num, '-FLAGS', '\\Seen')
    yield msg

def getAttachment(msg, check):
  for part in msg.walk():
    if part.get_content_type() == 'application/octet-stream':
      filename = part.get_filename()
      # get_filename() returns None when the part has no filename
      if filename and check(filename):
        return part.get_payload(decode=True)

if __name__ == '__main__':
  for msg in getMsgs():
    payload = getAttachment(msg, lambda x: x.endswith('.pem'))
    if not payload:
      continue
    try:
      cert = load_certificate(FILETYPE_PEM, payload)
    except Exception:
      cert = None
    if cert:
      cn = cert.get_subject().commonName
      filename = "%s.pem" % cn
      if not os.path.exists(filename):
        # the decoded payload is bytes, so write in binary mode
        with open(filename, 'wb') as f:
          f.write(payload)
        print("Writing to %s" % filename)
      else:
        print("%s already exists" % filename)
MattH
  • Good to know that this works for you. But any thoughts why it works as described? – Piotr Czapla Feb 09 '10 at 16:28
  • The return values are the tokenized IMAP server response. – MattH Feb 09 '10 at 17:38
  • Presumably higher-level imap libraries need to deal with foibles between different imap implementations, or be incompatible. – MattH Feb 09 '10 at 17:41
  • I am currently experiencing that `data[0]` is actually just a `bytes` object and not a tuple of `(bytes, bytes)`. My application continuously polls for new (unseen) messages from the IMAP server and this behaviour occurs when I mark the message as unread from the web interface. The service is at http://web.de/. More specifically, usually the data format is `[(bytes, bytes), bytes]` but when the message is marked as unseen manually, the format is `[bytes, (bytes, bytes), bytes]` – Niklas R Sep 08 '15 at 23:29
  • What if I want to read the body of a forwarded email? – Amogh Katwe Jun 11 '21 at 18:46
14

The IMAPClient package is a fair bit easier to work with. From the description:

Easy-to-use, Pythonic and complete IMAP client library.
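As a rough illustration of the difference, here is a sketch of fetching unseen message bodies through an IMAPClient-style object. It assumes `client` is an already logged-in `imapclient.IMAPClient`. The function name `fetch_unseen_bodies`, the host name and the credentials in the docstring are hypothetical:

```python
def fetch_unseen_bodies(client, folder="INBOX"):
    """Return {uid: raw RFC822 bytes} for the unseen messages in `folder`.

    `client` is assumed to be a logged-in imapclient.IMAPClient, e.g.:

        from imapclient import IMAPClient
        client = IMAPClient("imap.example.com", ssl=True)  # hypothetical host
        client.login("user", "password")                   # hypothetical login
    """
    # Open the folder read-only so fetching does not mark messages as seen.
    client.select_folder(folder, readonly=True)
    uids = client.search(["UNSEEN"])
    if not uids:
        return {}
    # fetch() returns a dict keyed by message id; each value is a dict
    # keyed by the requested data item as bytes.
    response = client.fetch(uids, ["RFC822"])
    return {uid: parts[b"RFC822"] for uid, parts in response.items()}
```

The response is a dictionary keyed by message id, so there is no positional data[0][1] indexing to get wrong.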

Robert Siemer
Peter Hansen
  • I support that. IMAPClient is very easy to use and object-oriented. It is much easier to use than imaplib and has no major issues. – zoobert Jun 27 '12 at 14:55
7

Try my package: https://pypi.org/project/imap-tools/

example:

from imap_tools import MailBox

# get list of email bodies from INBOX folder
with MailBox('imap.mail.com').login('test@mail.com', 'password', 'INBOX') as mailbox:
    bodies = [msg.text or msg.html for msg in mailbox.fetch()]

Features:

  • Basic message operations: fetch, uids, numbers

  • Parsed email message attributes

  • Query builder for search criteria

  • Actions with emails: copy, delete, flag, move, append

  • Actions with folders: list, set, get, create, exists, rename, subscribe, delete, status

  • IDLE commands: start, poll, stop, wait

  • Exceptions on failed IMAP operations

  • No external dependencies, tested

Vladimir
4

This was my solution to extract the useful bits of information. It's been reliable so far:

import datetime
import email
import email.header
import email.utils
import imaplib


EMAIL_ACCOUNT = "your@gmail.com"
PASSWORD = "your password"

mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(EMAIL_ACCOUNT, PASSWORD)
mail.list()
mail.select('inbox')
result, data = mail.uid('search', None, "UNSEEN") # (ALL/UNSEEN)

for x, email_uid in enumerate(data[0].split()):
    result, email_data = mail.uid('fetch', email_uid, '(RFC822)')
    # result, _ = mail.uid('store', email_uid, '+FLAGS', '\\Seen')
    # this might work to set flag to seen, if it doesn't already
    raw_email = email_data[0][1]
    # parse the raw bytes directly instead of decoding them to str first
    email_message = email.message_from_bytes(raw_email)

    # Header Details
    local_message_date = ""  # stays empty if the Date header is missing or unparsable
    date_tuple = email.utils.parsedate_tz(email_message['Date'])
    if date_tuple:
        local_date = datetime.datetime.fromtimestamp(email.utils.mktime_tz(date_tuple))
        local_message_date = local_date.strftime("%a, %d %b %Y %H:%M:%S")
    email_from = str(email.header.make_header(email.header.decode_header(email_message['From'])))
    email_to = str(email.header.make_header(email.header.decode_header(email_message['To'])))
    subject = str(email.header.make_header(email.header.decode_header(email_message['Subject'])))

    # Body details
    for part in email_message.walk():
        if part.get_content_type() == "text/plain":
            body = part.get_payload(decode=True)
            # use the charset declared by the part, falling back to UTF-8
            charset = part.get_content_charset() or 'utf-8'
            file_name = "email_" + str(x) + ".txt"
            with open(file_name, 'w') as output_file:
                output_file.write("From: %s\nTo: %s\nDate: %s\nSubject: %s\n\nBody: \n\n%s" %(email_from, email_to, local_message_date, subject, body.decode(charset, errors='replace')))
Edward Chapman
  • No, use `email.message_from_bytes` instead of *hoping* that converting the raw email body to a string doesn't do unexpected things; the extra roundtrip is unnecessary and wasteful anyway. – tripleee Oct 17 '20 at 09:56
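To illustrate the point in that comment, here is a small sketch with a made-up message whose body is 8-bit ISO-8859-1. Decoding the raw bytes as UTF-8 fails, while `email.message_from_bytes` parses them and applies the charset the message declares. The sample message and the use of `policy.default` are assumptions for the example:

```python
import email
from email import policy

# A made-up raw message as an IMAP server might return it (bytes).
raw = (
    b"From: sender@example.com\r\n"
    b"Subject: test\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
)

# The decode-then-parse route breaks on bytes that are not valid UTF-8.
try:
    raw.decode("utf-8")
    utf8_decodable = True
except UnicodeDecodeError:
    utf8_decodable = False

# Parsing the bytes directly lets the email package use the declared charset.
msg = email.message_from_bytes(raw, policy=policy.default)
body = msg.get_content()
```

Here `utf8_decodable` ends up `False`, while `body` contains the text decoded with the charset from the Content-Type header.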