6

I'm trying to make a Python program that retrieves only the body text of an email without passing headers or any other parameters. I'm not sure how to go about this.

The goal is to be able to send basic commands to a program via message text.

What I have now is this:

import poplib

host = "pop.gmail.com"
mail = poplib.POP3_SSL(host)
print mail.getwelcome()
print mail.user("user")
print mail.pass_("pass")
print mail.stat()
print mail.list()
print ""

if mail.stat()[1] > 0:
    print "You have new mail."
else:
    print "No new mail."

print ""

numMessages = len(mail.list()[1])
for i in range(numMessages):
    for j in mail.retr(i+1)[1]:
        print j

mail.quit()
input("Press any key to continue.")

This all works, except that "print j" prints the entire message, including the headers. I just want to extract the body text without any additional garbage.

Can anyone help? Thanks!

mplewis
  • WOW! Is that all it takes to download email in Python? I've been trying to do this in C# for months! And not one of the third-party components works in C#. I don't know Python, but I remember seeing the same problem somewhere. I'm looking for the website now, and if I can still find it, I'll post that person's solution here for you, if they found one. – jay_t55 Feb 07 '10 at 20:03
  • Python makes everything seem like a trivial task ;) – Ospho Jan 29 '14 at 21:40

3 Answers

5

I would use the email module: its get_payload() method returns the body of a message without the header fields.

I added a few lines to your code; each is marked with # new statement at the end of the line.

import poplib
import email # new statement

host = "pop.gmail.com"
mail = poplib.POP3_SSL(host)
print mail.getwelcome()
print mail.user("user")
print mail.pass_("pass")
print mail.stat()
print mail.list()
print ""

if mail.stat()[1] > 0:
    print "You have new mail."
else:
    print "No new mail."

print ""

numMessages = len(mail.list()[1])
for i in range(numMessages):
    for j in mail.retr(i+1)[1]:
        #print j
        msg = email.message_from_string(j) # new statement
        print(msg.get_payload()) # new statement

mail.quit()
input("Press any key to continue.")
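
Since retr() returns a message as a list of lines, the parser generally needs the whole message joined back together rather than one line at a time. Here is a minimal Python 3 sketch of that idea, assuming the lines arrive as bytes (which is how poplib returns them in Python 3). The helper name body_from_lines and the sample message are made up for illustration, and only the first text/plain part is considered:

```python
import email

def body_from_lines(lines):
    """Return the plain-text body of a message given as a list of byte lines.

    `lines` is assumed to look like the second element of the tuple
    returned by poplib's retr() in Python 3.
    """
    # Rebuild the full message before parsing it.
    msg = email.message_from_bytes(b"\r\n".join(lines))
    if msg.is_multipart():
        # Pick the first text/plain part; other parts are ignored here.
        for part in msg.walk():
            if part.get_content_type() == "text/plain":
                msg = part
                break
        else:
            return ""
    payload = msg.get_payload(decode=True)  # bytes, transfer encoding undone
    charset = msg.get_content_charset() or "utf-8"  # fallback is an assumption
    return payload.decode(charset, errors="replace")

# Hypothetical message, shaped like the output of retr(n)[1]
sample = [
    b"From: someone@example.com",
    b"Subject: test",
    b"",
    b"run backup",
]
print(body_from_lines(sample))
```

In the loop above, this would be called as body_from_lines(mail.retr(i+1)[1]).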
Chaos Manor
  • I get the error 'TypeError: initial_value must be str or None, not bytes' when trying to use your line msg = email.message_from_string(j) – Tintinabulator Zea Jan 02 '18 at 00:29
  • @Tintinabulator Zea - This happens because you are using **Python 3**. Use `msg = email.message_from_bytes(j)` instead of `msg = email.message_from_string(j)`. See also [Python Email Parsing Issue](https://stackoverflow.com/questions/19508393/python-email-parsing-issue/19508543). – Chaos Manor Jan 16 '18 at 06:53
4

This is a fragment of code from my own POP3 reader:

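        # "octets" avoids shadowing the built-in name bytes
        response, lines, octets = pop.retr(m)

        # remove trailing blank lines from message
        # (the "lines and" guard avoids an IndexError on an empty message)
        while lines and lines[-1] == "":
            del lines[-1]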

        try:
            endOfHeader = lines.index('')
            header = lines[:endOfHeader]
            body = lines[endOfHeader+1:]
        except ValueError:
            header = lines
            body = []

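This treats the first empty line in the list as the end of the headers: everything before it is the header block, and everything after it, sliced to the end of the list, is the message body. If there is no empty line at all, the whole message is treated as headers and the body is empty.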
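
The same split can be wrapped in a small function so it is easy to try on sample data. This is only a sketch: the function name split_header_body and the sample lines are hypothetical, and it assumes the message is already a list of lines, as retr() returns it. Testing with `not line` lets it accept either str or bytes lines:

```python
def split_header_body(lines):
    """Split a list of message lines at the first empty line."""
    lines = list(lines)  # work on a copy so the caller's list is untouched
    # Drop trailing blank lines; the guard avoids an IndexError on empty input.
    while lines and not lines[-1]:
        del lines[-1]
    for i, line in enumerate(lines):
        if not line:  # first empty line marks the end of the headers
            return lines[:i], lines[i + 1:]
    return lines, []  # no empty line: everything is header

header, body = split_header_body(
    ["From: a@example.com", "Subject: hi", "", "first line", "", "second line", ""]
)
print(header)
print(body)
```

Blank lines inside the body are kept; only the first one acts as the separator.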

PaulMcG
  • Thanks for the help! I'm having a little trouble though. What does m represent in pop.retr(m)? It's throwing errors and I don't know what to put there, or if this routine needs to reside in a specific sub. – mplewis Feb 08 '10 at 06:55
  • `pop.retr(m)` in my code is analogous to `mail.retr(i+1)` in your code - m is the message number, and is an integer. See how I unpack the tuple returned by retr, while you just take the [1]th element? In your code, just iterate over lines until you hit an empty string, or you run out of lines. The rest is the body. – PaulMcG Feb 08 '10 at 12:14
2

You can parse emails using the standard-library email module.
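
As a rough illustration in Python 3 (3.6 or later), the sketch below parses a hard-coded message with email.policy.default and asks for the plain-text body via get_body(). The raw message and the variable names are invented for the example; real input would come from the POP3 server.

```python
import email
from email import policy

# Hypothetical raw message; in practice this comes from the mail server.
raw = (
    "From: someone@example.com\r\n"
    "Subject: command\r\n"
    "Content-Type: text/plain; charset=utf-8\r\n"
    "\r\n"
    "status\r\n"
)

# policy.default makes the parser return an EmailMessage object.
msg = email.message_from_string(raw, policy=policy.default)

# Ask for the text/plain body; get_body() returns None if there is none.
part = msg.get_body(preferencelist=("plain",))
text = part.get_content() if part is not None else ""

print(msg["Subject"])
print(text.strip())
```

With this interface the headers stay available through msg["Subject"] and similar lookups, while the body text is returned separately and already decoded to str.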

ebo