I'm trying to write a Python script to read my emails. I can extract most of the fields correctly, such as To, From, and Subject. But for the body, I get the plain text as well as its HTML markup.
Below is the part of the code that extracts the content from the email:
import email

# raw_email holds the raw message string fetched earlier in the script
email_message = email.message_from_string(raw_email)
print 'To:', email_message['To']
print 'Sent from:', email_message['From']
print 'Date:', email_message['Date']
print 'Subject:', email_message['Subject']
print '*'*30, 'MESSAGE', '*'*30
maintype = email_message.get_content_maintype()
#print maintype
if maintype == 'multipart':
    # Prints every text/* part, so both text/plain and text/html show up
    for part in email_message.get_payload():
        if part.get_content_maintype() == 'text':
            print part.get_payload()
elif maintype == 'text':
    print email_message.get_payload()
print '*'*69
Git link for the complete code: Email-parser
How can I get rid of the HTML and get only the plain text?
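One direction that might work is to check the full content type of each part instead of only the main type, and keep just the text/plain parts. The following is a minimal sketch of that idea, written in Python 3 syntax rather than the Python 2 used above. The helper name get_plain_text and the SAMPLE message are hypothetical and exist only for illustration; messages that contain nothing but an HTML part would still need separate handling.

```python
import email


def get_plain_text(raw_email):
    """Return the text/plain parts of a raw email string, joined by newlines."""
    message = email.message_from_string(raw_email)
    chunks = []
    # walk() visits the message itself and every nested part, so nested
    # multipart containers are covered as well.
    for part in message.walk():
        # Keep only text/plain; this skips text/html and the containers.
        if part.get_content_type() != 'text/plain':
            continue
        # Skip plain-text files that were sent as attachments.
        if part.get_content_disposition() == 'attachment':
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Fall back to UTF-8 when the part does not declare a charset.
        charset = part.get_content_charset() or 'utf-8'
        chunks.append(payload.decode(charset, errors='replace'))
    return '\n'.join(chunks)


# Hypothetical two-part message: the same body as plain text and as HTML.
SAMPLE = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: Hi\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "\n"
    "Hello there\n"
    "--XYZ\n"
    'Content-Type: text/html; charset="utf-8"\n'
    "\n"
    "<p>Hello there</p>\n"
    "--XYZ--\n"
)

if __name__ == '__main__':
    print(get_plain_text(SAMPLE))
```

Filtering on get_content_type() rather than get_content_maintype() is what separates text/plain from text/html here, since both share the main type 'text'.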