Python smtplib and sanitization

Question

I'm using smtplib in a very basic way (almost exactly as shown in the docs). A subject and message are retrieved from a Bottle FormsDict using request.forms.get() and then emailed using this code:

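import smtplib
from email.mime.text import MIMEText

# subject and message come from request.forms.get(); config holds the addresses
msg = MIMEText(message)
msg['Subject'] = subject
msg['From'] = config['from_email']
msg['To'] = config['to_email']

# Hand the serialized message to the local SMTP server
s = smtplib.SMTP('localhost')
s.sendmail(config['from_email'], [config['to_email']], msg.as_string())
s.quit()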

I'm used to sanitizing user input for XSS and such (usually just relying on Jinja2's magic). What should I be doing in this case though, where I'm only sending the user's input through email? What sort of vulnerabilities would there be?

score 2 · Answer 1 · edited May 23 '17 at 12:21

I'm trying to figure out that myself right now and one thing I can tell you is that you should definitely use email.header.Header to detect header injections:

>>> from email.header import Header
>>> Header('Test').encode()
'Test'
>>> Header('Test\n').encode()
'Test'
>>> Header('Test\nTest2').encode()
'Test\nTest2'
>>> Header('Test\nFrom').encode()
'Test\nFrom'
>>> Header('Test\nFrom:').encode()
(...)
HeaderParseError: header value appears to contain an embedded header: 'Test\nFrom:'

Also check this answer. I agree that if the input looks potentially dangerous, you should just reject it, since that probably means someone is trying to do something sketchy.
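
A minimal sketch of that "just reject it" approach, assuming a header value such as the subject should never legitimately contain a line break. The helper name require_single_line is hypothetical and not part of smtplib or the email package:

```python
def require_single_line(value, field='header'):
    """Return value unchanged, or raise if it contains CR or LF.

    Hypothetical helper: a line break in a header value is the usual
    sign of an attempted header injection, so refuse it outright.
    """
    if '\r' in value or '\n' in value:
        raise ValueError('%s must not contain line breaks' % field)
    return value
```

You would call it on the form value before assigning it, e.g. msg['Subject'] = require_single_line(subject, 'subject'), and turn the ValueError into an error response.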

EDIT:

It turns out that MIME messages validate headers on their own, even if you don't use email.header.Header, and also nicely encode the body:

>>> msg = MIMEText('something\r\nsomething2', 'plain', 'UTF-8')
>>> msg.as_string()
'MIME-Version: 1.0\nContent-Type: text/plain; charset="utf-8"\nContent-Transfer-Encoding: base64\n\nc29tZXRoaW5nDQpzb21ldGhpbmcy\n'
>>> msg['From'] = 'me@localhost\r\nSubject: injected subject'
>>> msg.as_string()
(...)
HeaderParseError: header value appears to contain an embedded header: 'me@localhost\nSubject: injected subject'
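
In application code you would probably want to catch that exception rather than let it propagate. A rough sketch, assuming Python 3 and the classic MIMEText API; build_message and its parameters are made-up names for illustration:

```python
from email.errors import HeaderParseError
from email.mime.text import MIMEText


def build_message(subject, body, sender, recipient):
    """Serialize a plain-text message, or return None if a header is rejected.

    Hypothetical helper: the header check runs when the message is
    flattened, so the exception surfaces in as_string(), not on assignment.
    """
    msg = MIMEText(body, 'plain', 'utf-8')
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = recipient
    try:
        return msg.as_string()
    except HeaderParseError:
        # A header value looked like it contained an embedded header
        return None
```

Returning None is just one option; logging the attempt and answering with an HTTP 400 would work equally well.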

You can find more possible injections at Is there any injection vulnerability in the body of an email?.

So I would say that you don't have to do anything special to stay on the safe side as:

- header injection is detected by default, so an attacker cannot add headers by tampering with the value of subject
- the body is encoded/quoted, so if there is any evil sequence of characters that could break something, it should be neutralized
- even if the body weren't encoded, I think the only way to cause harm would be to change the message structure, e.g. by injecting a MIME boundary (for multipart messages; example: https://bugzilla.mozilla.org/show_bug.cgi?id=600464). But the boundary is a long random string, so it would probably be easier to guess your bank password.

The <CRLF>.<CRLF> sequence terminates the message body, but that is not a problem, as the MIME classes replace CRLF with LF:

>>> MIMEText('something\r\n.\r\nsomething2', 'plain', _charset='iso-8859-1').as_string()
'Content-Type: text/plain; charset="iso-8859-1"\nMIME-Version: 1.0\nContent-Transfer-Encoding: quoted-printable\n\nsomething\n.\nsomething2'

The `.` thing is a red herring. Like it says in the answer you link to, it's only a problem if you are writing your own SMTP client. Above that level, it's handled transparently. — tripleee, Sep 16 '14 at 17:32
@tripleee Now I'm puzzled by how MIMEText sanitizes the input. According to http://tools.ietf.org/html/rfc5322#section-2.3 CR and LF should always go together, whereas MIMEText converts CRLF to LF. — Tomasz Zieliński, Sep 16 '14 at 17:38
Apples and oranges. On the SMTP level all line terminators get changed to CRLF in transport. Dot stuffing is not the realm or responsibility of the MIME layer at all; it's a concern during the SMTP transaction only. — tripleee, Sep 16 '14 at 17:51
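
To make the transport-level step in these comments concrete, here is an illustrative sketch of what an SMTP client does with the body before sending it: normalize line endings to CRLF and double any leading dot (the "transparency" rule in RFC 5321). The function dot_stuff is a hypothetical name written for this example; it is not smtplib's actual code, and you do not need to call anything like it yourself when using smtplib:

```python
import re


def dot_stuff(body):
    """Illustrative only: prepare a message body for the SMTP DATA phase."""
    # Split on any of the common line endings (CRLF, bare CR, bare LF)
    lines = re.split(r'\r\n|\r|\n', body)
    # Drop the empty piece produced by a trailing line ending
    if lines and lines[-1] == '':
        lines.pop()
    # A line starting with '.' gets an extra '.', so a lone '.' line
    # can no longer be mistaken for the end-of-data marker
    stuffed = ['.' + line if line.startswith('.') else line for line in lines]
    return '\r\n'.join(stuffed) + '\r\n'
```

The receiving server strips the extra dot again, so the recipient sees the original text.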
