How to extract an email body from a file using email.Parser?

Question

I am trying to use Python and email.Parser to parse an email from a file. I use the following command

headers = Parser().parse(open(filename, 'r'))

to parse the file. But when I try to get the body I use e.g.

print(headers.get_payload()[0])

and I get something like

From nobody Mon Oct 12 16:32:25 2015
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi Alex,
....

Is there some way to get rid of those first three/four lines? And how to decode content like 'fr=C3=BCher'?

score 3 · Answer 1 · answered Oct 13 '15 at 07:20

To get to the message body, you have to walk() its different parts, i.e.:

import email

with open(filename, 'r') as f:
    a = email.message_from_file(f)  # shorthand for Parser().parse(f)
body = ''

if a.is_multipart():
    for part in a.walk():
        ctype = part.get_content_type()
        cdispo = str(part.get('Content-Disposition'))

        # skip any text/plain (txt) attachments
        if ctype == 'text/plain' and 'attachment' not in cdispo:
            body = part.get_payload(decode=True)  # undo the transfer encoding
            break
# not multipart - i.e. plain text, no attachments
else:
    body = a.get_payload(decode=True)

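The decode=True argument to get_payload() undoes the Content-Transfer-Encoding (base64, quoted-printable, etc.), which is what turns strings like 'fr=C3=BCher' back into their original bytes. On Python 3 the result is a bytes object, so decode it with the part's charset to get text.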
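
As a minimal sketch of that decoding step, assuming Python 3 and a made-up single-part message (the `raw` string below is hypothetical, not the asker's file), the bytes returned by get_payload(decode=True) can be turned into text using the charset declared in the headers. The UTF-8 fallback is an assumption for messages that declare no charset.

```python
import email

# Hypothetical sample message, made up for illustration.
raw = (
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Hi Alex, fr=C3=BCher\n"
)

msg = email.message_from_string(raw)

# decode=True undoes the Content-Transfer-Encoding and returns bytes.
payload = msg.get_payload(decode=True)

# Convert the bytes to text with the declared charset,
# falling back to UTF-8 (an assumption) when none is declared.
charset = msg.get_content_charset() or "utf-8"
body = payload.decode(charset, errors="replace")

print(body)
```

The same two steps apply to the part picked out by the walk() loop: call get_payload(decode=True) on the part, then decode with part.get_content_charset().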

sure thing; you can check a little longer rant by me on this topic, in the question Anurag has linked — Todor Minakov, Oct 13 '15 at 07:33

score 0 · Answer 2 · edited May 23 '17 at 12:14

Use Message.get_payload

See this answer: "Python: How to parse the Body from a raw email, given that raw email does not have a "Body" tag or anything"
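
A rough illustration of what get_payload() returns for a multipart message, assuming Python 3 and a made-up two-part message (`raw` and the boundary "XYZ" are hypothetical): the call yields a list of Message objects, so indexing it as in the question gives a whole sub-message with its own headers, and get_payload() has to be called again on that part to reach the text.

```python
import email

# Hypothetical two-part message, made up for illustration.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Hi Alex, fr=C3=BCher\n"
    "--XYZ\n"
    "Content-Type: text/html; charset=UTF-8\n"
    "\n"
    "<p>Hi Alex</p>\n"
    "--XYZ--\n"
)

msg = email.message_from_string(raw)

parts = msg.get_payload()  # for multipart mail: a list of Message objects
first = parts[0]           # still a Message, so printing it includes headers

# Ask the part itself for its payload, then decode the bytes to text.
charset = first.get_content_charset() or "utf-8"  # fallback is an assumption
text = first.get_payload(decode=True).decode(charset)

print(text)
```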

answered Oct 12 '15 at 15:08

Anurag Verma
