My setup uses fetchmail to pull emails from Gmail; procmail processes them and passes each one to a Python script.

When I call email.message_from_string(), the message is not parsed properly: no headers are recognized, and get_payload() returns the headers and body of the email as a single text blob.

This is the text it returns:

From example@gmail.com  Sat Aug 17 19:20:44 2013
>From example  Sat Aug 17 19:20:44 2013
MIME-Version: 1.0
Received: from ie-in-f109.1e100.net [74.125.142.109]
    by VirtualBox with IMAP (fetchmail-6.3.21)
    for <example@localhost> (single-drop); Sat, 17 Aug 2013 19:20:44 -0700 (PDT)
Received: by 10.70.131.110 with HTTP; Sat, 17 Aug 2013 19:20:42 -0700 (PDT)
Date: Sat, 17 Aug 2013 19:20:42 -0700
Delivered-To: example@gmail.com
Message-ID: <CAAsp4m0GBeVg80-ryFgNvNNAj_QPguzbX3DqvMSx-xSGZM18Pw@mail.gmail.com>
Subject: test 19:20
From: example <example@gmail.com>
To: example <example@gmail.com>
Content-Type: multipart/alternative; boundary=001a1133435474449004e42f7861

--001a1133435474449004e42f7861
Content-Type: text/plain; charset=ISO-8859-1

19:20

--001a1133435474449004e42f7861
Content-Type: text/html; charset=ISO-8859-1

<div dir="ltr">19:20</div>

--001a1133435474449004e42f7861--

My code:

import email
import sys

full_msg = sys.stdin.read()                 # raw message piped in by procmail
msg = email.message_from_string(full_msg)
msg['to']          # returns None
msg.get_payload()  # returns the text above

What am I missing to get Python to properly interpret the email?

I see from these questions that I may not be getting the proper email headers somewhere along the line, but I cannot confirm. That ">" on line 2 is not a typo: it's in the text.

schroeder

1 Answer

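Even if the ">" really is present in the text your script receives, it should not be there. A line such as ">From example ..." is not a valid header line, so the parser stops reading headers at that point and treats everything from there on as the body. After removing this character: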

>python test.py <input.txt
example <example@gmail.com>
[<email.message.Message instance at 0x02810288>, <email.message.Message instance at 0x02810058>]

So the error is not in parsing the message, but in the ">" character somehow corrupting your email text.
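If the stray line cannot be removed upstream, one possible workaround is to drop it in the script before parsing. The following is only a sketch under some assumptions: the helper names strip_escaped_from_lines and parse_message are hypothetical, SAMPLE is a shortened stand-in for the message in the question, and the input is assumed to use "\n" line endings. Only lines in the header block (before the first blank line) are touched, so a ">From " line in the body is left alone.

```python
import email

def strip_escaped_from_lines(raw):
    """Drop '>From ' lines from the header block (hypothetical helper)."""
    # Everything before the first blank line is the header block.
    head, sep, body = raw.partition("\n\n")
    kept = [line for line in head.split("\n") if not line.startswith(">From ")]
    return "\n".join(kept) + sep + body

def parse_message(raw):
    """Parse a raw message after removing the stray '>From ' line."""
    return email.message_from_string(strip_escaped_from_lines(raw))

# Shortened stand-in for the message shown in the question.
SAMPLE = (
    "From example@gmail.com  Sat Aug 17 19:20:44 2013\n"
    ">From example  Sat Aug 17 19:20:44 2013\n"
    "MIME-Version: 1.0\n"
    "Subject: test 19:20\n"
    "To: example <example@gmail.com>\n"
    "Content-Type: multipart/alternative; boundary=XYZ\n"
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "19:20\n"
    "\n"
    "--XYZ\n"
    "Content-Type: text/html; charset=ISO-8859-1\n"
    "\n"
    "<div dir=\"ltr\">19:20</div>\n"
    "\n"
    "--XYZ--\n"
)
```

In the script from the question this would be used as msg = parse_message(sys.stdin.read()). Finding out which step of the fetchmail/procmail pipeline adds the line is still the cleaner fix.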

BartoszKP
  • Don't know what's adding that ">", but once I delete it in the script, the parser works fine. The entire system now works as it should. Thanks for the idea. – schroeder Aug 18 '13 at 23:33