How to parse only text data from an email?
My inbox contains several types of emails:
Only text
Blank email
Text with HTML
HTML with attachment
Text with attachment
How can I identify the type of each email and then extract only the emails that contain just text?
I have written a function that loops through the emails, but I'm stuck on extracting the body for a few of them. I get the error below on this line:
email.message_from_string(response_part[1].decode('utf-8'))
'utf-8' codec can't decode byte 0xa0 : invalid start byte
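The byte 0xa0 suggests that the raw message is not UTF-8 encoded (in Latin-1 and Windows-1252, for example, 0xa0 is a non-breaking space). One possible way around the error, sketched here with a hand-built sample message instead of real IMAP data, is to hand the raw bytes to `email.message_from_bytes` and let the parser apply the charset that each part declares. The `raw` value and the addresses below are made up for illustration:

```python
import email
from email import policy

# Hypothetical raw message standing in for response_part[1].
# The body contains byte 0xa0, which is not valid UTF-8 on its own.
raw = (
    b"From: a@example.com\r\n"
    b"Subject: test\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"hello\xa0world\r\n"
)

# Parse the bytes directly instead of decoding them to str first.
msg = email.message_from_bytes(raw, policy=policy.default)

# get_content() decodes the payload with the charset declared in the
# Content-Type header (iso-8859-1 here), not with a hard-coded utf-8.
body = msg.get_content()
print(repr(body))
```

This only helps when the declared charset is correct; a message that declares the wrong charset may still come out garbled.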
Here is the function that loops through the emails:
import email
import imaplib

# SMTP_SERVER, FROM_EMAIL and FROM_PWD are defined elsewhere.

def read_email():
    try:
        mail = imaplib.IMAP4_SSL(SMTP_SERVER)
        mail.login(FROM_EMAIL, FROM_PWD)
        mail.select('inbox')
        typ, data = mail.search(None, 'ALL')
        mail_ids = data[0]
        id_list = mail_ids.split()
        # Walk the message ids from newest to oldest
        for i in reversed(id_list):
            typ, data = mail.fetch(i, '(RFC822)')
            for response_part in data:
                if isinstance(response_part, tuple):
                    # This is the line that raises the UnicodeDecodeError
                    msg = email.message_from_string(response_part[1].decode('utf-8'))
                    email_subject = msg['subject']
                    email_from = msg['from']
                    email_to = msg['to']
                    emailid = msg['message-id']
    except Exception as e:
        print(str(e))
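For the "identify the email type" part, one approach that might work is to walk the MIME parts of each parsed message and note which content types and dispositions appear. The sketch below is only an illustration: `classify` is a hypothetical helper, the labels it returns are made up, and the sample messages are built locally with `EmailMessage` instead of being fetched over IMAP.

```python
from email.message import EmailMessage


def classify(msg):
    """Return a rough label for a parsed message (hypothetical helper)."""
    has_text = has_html = has_attachment = False
    for part in msg.walk():
        if part.is_multipart():
            continue  # multipart containers carry no content of their own
        if part.get_content_disposition() == "attachment":
            has_attachment = True
        elif part.get_content_type() == "text/plain":
            # A whitespace-only body is treated as no text at all.
            payload = part.get_payload(decode=True) or b""
            has_text = has_text or bool(payload.strip())
        elif part.get_content_type() == "text/html":
            has_html = True

    if has_text and has_html:
        kind = "text and html"
    elif has_html:
        kind = "html"
    elif has_text:
        kind = "text"
    else:
        kind = "blank"
    return kind + " with attachment" if has_attachment else kind


# Sample messages built locally for illustration.
plain = EmailMessage()
plain.set_content("Hello")

blank = EmailMessage()
blank.set_content("   ")

mixed = EmailMessage()
mixed.set_content("Hello")
mixed.add_alternative("<p>Hello</p>", subtype="html")
mixed.add_attachment(b"data", maintype="application",
                     subtype="octet-stream", filename="a.bin")

for m in (plain, blank, mixed):
    print(classify(m))
```

With something like this in place, keeping only the messages for which `classify(msg) == "text"` would give the text-only emails. Real mail can be messier (inline images, text files sent as attachments), so the rules above would likely need adjusting.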