I know that each part of a multipart email message can be a multipart itself. Are attachments added only as top-level parts, or can they be in a nested multipart as well?

As an example of what I mean, here attachment1.doc is nested, while attachment2.doc is a top-level part.

multipart/mixed
   |---Title: text/plain
   |---Text content: text/plain
   |---Nested multipart: multipart/mixed
   |      |--- attachment1.doc (BASE64)
   |---attachment2.doc (BASE64)

I'm asking because I encountered this code from https://stackoverflow.com/a/27556667/492336:

    # Iterate the different parts of the multipart message.
    for part in msg.walk():
        # Skip any nested multipart.
        if part.get_content_maintype() == 'multipart':
            continue

It's in Python, and they iterate through the different parts of the message to search for attachments, but skip any part that is itself a multipart.

Are they correct to do that? I tried reading RFC 3501, but couldn't find anything definitive saying whether file attachments can or cannot be nested.

sashoalm

2 Answers

There is no prescription for limitations, and you would be hard pressed to argue for a single policy for all multipart types -- they have quite distinct purposes.

For example, with a message like

multipart/mixed
  +-- multipart/alternative
  |     +-- text/plain
  |     +-- multipart/related
  |           +-- text/html
  |           +-- image/png
  |           +-- image/png
  +-- application/octet-stream; name="attachment.pdf"

... the sane behavior for most clients which want to provide an HTML view of the message would be to pick the multipart/related inside multipart/alternative with all its attachments, and use that for displaying the message, while displaying the PDF as a separate attachment. If you only process the top-level multipart/mixed you only see the attachment, which doesn't seem like a sane approach.

Another case where completely arbitrary nesting can occur is message/rfc822, where the attached message is a complete MIME message of its own, which might in turn contain another message/rfc822, and so on recursively.
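To make the message/rfc822 case concrete, here is a minimal sketch using Python's standard `email` package. The helper names (`build_forwarded`, `content_types`), the file name `report.pdf`, and the payload bytes are made up for illustration. It suggests that `walk()` descends into an attached message just as it descends into a nested multipart.

```python
from email.mime.application import MIMEApplication
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_forwarded():
    """Build a hypothetical message that carries another message as a part."""
    # The inner message is a complete MIME message with its own attachment.
    inner = MIMEMultipart("mixed")
    inner.attach(MIMEText("Original message", "plain"))
    report = MIMEApplication(b"placeholder bytes", "pdf")
    report.add_header("Content-Disposition", "attachment", filename="report.pdf")
    inner.attach(report)

    # The outer message wraps the inner one in a message/rfc822 part.
    outer = MIMEMultipart("mixed")
    outer.attach(MIMEText("See the forwarded message", "plain"))
    outer.attach(MIMEMessage(inner))
    return outer


def content_types(msg):
    """List the content type of every part, in the order walk() visits them."""
    return [part.get_content_type() for part in msg.walk()]


if __name__ == "__main__":
    for content_type in content_types(build_forwarded()):
        print(content_type)
```

Note that the message/rfc822 part is itself a container whose main type is "message", so a check for the "multipart" main type alone will not skip it.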

Anything with an (explicit or implied) Content-Disposition: attachment is an "attachment". You do sometimes see "attachments" inside e.g. multipart/alternative, which would imply that the attachment only makes sense if you are displaying that alternative view of the message. I am hard pressed to come up with an example where this would be true, and might actually speculate that it should be regarded as an error, and that a client should display the attachment when rendering another alternative too, just in case.
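As a rough sketch of the Content-Disposition idea, the following Python builds a message shaped like the tree above and collects only the parts that carry an explicit Content-Disposition: attachment. The helper names (`build_example`, `explicit_attachments`) and the placeholder payloads are hypothetical, and the sketch deliberately ignores the "implied" case, which needs client-specific heuristics.

```python
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_example():
    """Build a hypothetical message shaped like the tree above."""
    related = MIMEMultipart("related")
    related.attach(MIMEText("<p>Hello <img src='cid:logo'></p>", "html"))
    # Placeholder bytes; the subtype is given explicitly so nothing is guessed.
    logo = MIMEImage(b"not a real PNG", "png")
    logo.add_header("Content-ID", "<logo>")
    logo.add_header("Content-Disposition", "inline")
    related.attach(logo)

    alternative = MIMEMultipart("alternative")
    alternative.attach(MIMEText("Hello", "plain"))
    alternative.attach(related)

    pdf = MIMEApplication(b"not a real PDF", "octet-stream", name="attachment.pdf")
    pdf.add_header("Content-Disposition", "attachment", filename="attachment.pdf")

    outer = MIMEMultipart("mixed")
    outer.attach(alternative)
    outer.attach(pdf)
    return outer


def explicit_attachments(msg):
    """Return (filename, content type) for parts explicitly marked as attachments."""
    found = []
    for part in msg.walk():
        # Containers hold other parts only; their children are still visited.
        if part.get_content_maintype() == "multipart":
            continue
        if part.get_content_disposition() == "attachment":
            found.append((part.get_filename(), part.get_content_type()))
    return found


if __name__ == "__main__":
    print(explicit_attachments(build_example()))
```

Here the inline image inside multipart/related is not reported, while the PDF is, regardless of how deeply either one is nested.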

As a belated addendum, the Python code is correct; it bypasses the containers but still examines their contents. Compare to a file search where you would not search directories themselves for your search text, but still examine the actual files within. A multipart MIME part by itself only contains other MIME parts.
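To illustrate, here is a small sketch that builds the structure from the question and runs the same loop over it. The helper names (`build_nested_message`, `attachment_names`) and the payload bytes are invented for the example; the point is only that skipping the container does not skip what is inside it.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_nested_message():
    """Build a message shaped like the tree in the question (made-up content)."""
    outer = MIMEMultipart("mixed")
    outer.attach(MIMEText("Title", "plain"))
    outer.attach(MIMEText("Text content", "plain"))

    # attachment1.doc sits inside a nested multipart/mixed.
    nested = MIMEMultipart("mixed")
    doc1 = MIMEApplication(b"first file", "msword")
    doc1.add_header("Content-Disposition", "attachment", filename="attachment1.doc")
    nested.attach(doc1)
    outer.attach(nested)

    # attachment2.doc is a top-level part.
    doc2 = MIMEApplication(b"second file", "msword")
    doc2.add_header("Content-Disposition", "attachment", filename="attachment2.doc")
    outer.attach(doc2)
    return outer


def attachment_names(msg):
    """Collect file names with the same skip-the-container loop as in the question."""
    names = []
    for part in msg.walk():
        # Skip the multipart container itself; walk() still yields its children.
        if part.get_content_maintype() == "multipart":
            continue
        filename = part.get_filename()
        if filename:
            names.append(filename)
    return names


if __name__ == "__main__":
    print(attachment_names(build_nested_message()))
```

Both file names come back, the nested one included, because `walk()` traverses the whole tree depth-first and the `continue` only discards the container objects.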

tripleee
  • Doesn't it depend on the email client application that was used to compose the message? In that case, Thunderbird, Outlook, and Gmail's behavior should cover 99% of the cases... Maybe there is a de facto standard, who knows. The only files I've found so far that were nested were images meant to be displayed in the email itself. Oh, and thanks for the answer! – sashoalm Nov 13 '15 at 09:11
  • Outlook may be very common, but adapting to its behavior is not something I would do lightly. Sadly, Thunderbird and Gmail also show stark disregard for the standards in some places. If you do want to cover the common cases, maybe add Apple's Mail.app to your list. – tripleee Nov 13 '15 at 09:15

Nested multiparts are legal, and common for a few use cases. Most importantly, if you use S/MIME to sign a multipart message containing text and a picture, you'll typically have a top-level multipart/signed containing a multipart/mixed and some other parts, and the multipart/mixed in turn contains a text/plain and an image/jpeg.

arnt
  • Thanks. Do you know where it is described in the RFCs? I assume it's in https://tools.ietf.org/html/rfc2045#section-2.3 somewhere. – sashoalm Nov 13 '15 at 09:14
  • 2045 says multiparts contain zero or more parts. That's it. There aren't any relevant restrictions, therefore wrapping multiparts in multiparts is allowed. (All S/MIME does in this context is to offer a relevant example.) – arnt Nov 13 '15 at 10:40