I can download the eml file using mime-content. I need to edit this eml file and delete its attachments. I can already look up the attachment names. If I understand correctly, the email headers come first, then the body, and then the attachments. I need advice on how to delete the attachments from the email.

import email
from email import policy
from email.parser import BytesParser
with open('messag.eml', 'rb') as fp:  # select a specific email file
    msg = BytesParser(policy=policy.default).parse(fp)
    text = msg.get_body(preferencelist=('plain')).get_content()
    print(text)  # print the email content
    for attachment in attachments:
        fnam=attachment.get_filename()
        print(fnam) #print attachment name
tripleee
  • https://stackoverflow.com/questions/1626403/python-email-lib-how-to-remove-attachment-from-existing-message is basically the same question for Python 2, but as the `email` API has changed considerably since then, I'm posting a new answer here, and leaving a pointer at the old question. – tripleee Nov 05 '21 at 10:20
  • Regarding understanding email message structures, probably refer to https://stackoverflow.com/questions/48562935/what-are-the-parts-in-a-multipart-email – tripleee Nov 05 '21 at 11:53

1 Answer

The term "eml" is not strictly well-defined but it looks like you want to process standard RFC5322 (née 822) messages.

The Python email library went through an overhaul in Python 3.6; you'll want to make sure you use the modern API, like you already do (the one which uses a policy argument). The way to zap an attachment is simply to use its clear() method, though your code doesn't correctly fetch the attachments in the first place. Try this:

import email
from email import policy
from email.parser import BytesParser

with open('messag.eml', 'rb') as fp:  # select a specific email file
    msg = BytesParser(policy=policy.default).parse(fp)
    text = msg.get_body(preferencelist=('plain',)).get_content()
    print(text)
    # Notice the iter_attachments() method
    for attachment in msg.iter_attachments():
        fnam = attachment.get_filename()
        print(fnam)
        # Remove this attachment
        attachment.clear()

with open('updated.eml', 'wb') as wp:
    wp.write(msg.as_bytes())

The updated message in updated.eml might have some headers rewritten, as Python doesn't preserve precisely the same spacing etc in all headers.
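
Note that clear() empties a part (headers and payload) but leaves the now-empty part inside the multipart container, so a mail client may still show it as an empty attachment. If you want the parts gone entirely, one option is to rebuild the top-level payload without them. The following is only a minimal sketch under some assumptions: strip_attachments is a hypothetical helper name, the demo message and its addresses are made up, and only attachments found by iter_attachments() on the top-level message are handled (parts nested in inner multiparts are not touched).

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def strip_attachments(msg):
    """Drop top-level attachment parts from msg in place; return their filenames.

    Hypothetical helper, not part of the email library.
    """
    attachments = list(msg.iter_attachments())
    removed_ids = {id(part) for part in attachments}
    names = [part.get_filename() for part in attachments]
    if attachments:
        # For a multipart message, get_payload() returns the list of sub-parts;
        # keep every part which is not one of the attachments.
        kept = [part for part in msg.get_payload() if id(part) not in removed_ids]
        msg.set_payload(kept)
    return names


# Build a small demo message instead of reading a file (made-up content).
demo = EmailMessage()
demo['Subject'] = 'demo'
demo['From'] = 'a@example.com'
demo['To'] = 'b@example.com'
demo.set_content('Hello')
demo.add_attachment(b'12345', maintype='application',
                    subtype='octet-stream', filename='data.bin')

# Round-trip through bytes, as if the message had been read from an .eml file.
msg = BytesParser(policy=policy.default).parsebytes(demo.as_bytes())

removed = strip_attachments(msg)
remaining = [part.get_filename() for part in msg.iter_attachments()]
body = msg.get_body(preferencelist=('plain',)).get_content()
```

Afterwards, msg.as_bytes() can be written out exactly like in the code above. The message keeps its multipart container even if only the body part is left, which is valid but means the structure is not identical to a message which never had attachments.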

tripleee
  • that's how it works. The only problem is that there are empty txt files instead of attachments, but I still care about the size of the email as such. – Patrik Novotný Nov 05 '21 at 11:42
  • Not sure what you mean by that. If you have messages which do not have the prescribed structure, you probably want to put a condition in place to not modify them. – tripleee Nov 05 '21 at 11:52