Python ships with a fairly capable MIME library, email.mime.

I want a MIME part containing plain UTF-8 text to be encoded as quoted-printable rather than base64. All the functionality seems to be available in the library, but I have not managed to use it:

Example:

import email.mime.text, email.encoders
m=email.mime.text.MIMEText(u'This is the text containing ünicöde', _charset='utf-8')
m.as_string()
# => Leads to a base64-encoded message, as base64 is the default.

email.encoders.encode_quopri(m)
m.as_string()
# => Leads to a strange message

The last command leads to a strange message:

Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Transfer-Encoding: quoted-printable

VGhpcyBpcyB0aGUgdGV4dCBjb250YWluaW5nIMO8bmljw7ZkZQ=3D=3D

This is obviously not quoted-printable: the body is still the base64 text, with only its "=" padding escaped as "=3D". The duplicated Content-Transfer-Encoding header is strange at least (if not illegal).

How can I get my text encoded as quoted-printable in the MIME message?

theomega
  • 1
    See also http://stackoverflow.com/a/9509718/874188 -- the question is Python 3, but I have used it in Python 2 as well. – tripleee Jul 30 '15 at 04:26
  • 1
    For Python 3.6+ see also now https://stackoverflow.com/questions/66039715/python3-email-message-to-disable-base64-and-remove-mime-version/66041936#66041936 – tripleee Feb 04 '21 at 08:33
  • Similar to [Python send email with "quoted-printable" transfer-encoding and "utf-8" content-encoding](https://stackoverflow.com/q/31714221/471376) – JamesThomasMoon Mar 16 '22 at 22:32

3 Answers

Okay, I found one solution. It is quite hacky, but it at least points in a useful direction: MIMEText assumes base64 and I don't know how to change that, so I use MIMENonMultipart instead:

import email.mime, email.mime.nonmultipart, email.charset
m = email.mime.nonmultipart.MIMENonMultipart('text', 'plain', charset='utf-8')

# Construct a new charset which uses quoted-printable (base64 is the default)
cs = email.charset.Charset('utf-8')
cs.body_encoding = email.charset.QP

# Now set the content using the new charset
m.set_payload(u'This is the text containing ünicöde', charset=cs)

Now the message seems to be encoded correctly:

Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

This is the text containing =C3=BCnic=C3=B6de

One can even construct a new class which hides the complexity:

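class MIMEUTF8QPText(email.mime.nonmultipart.MIMENonMultipart):
    """text/plain part whose UTF-8 body is encoded as quoted-printable."""

    def __init__(self, payload):
        email.mime.nonmultipart.MIMENonMultipart.__init__(
            self, 'text', 'plain', charset='utf-8')

        # UTF-8 charset whose body encoding is QP instead of the base64 default
        utf8qp = email.charset.Charset('utf-8')
        utf8qp.body_encoding = email.charset.QP

        # Passing the charset makes set_payload encode the body and add the header
        self.set_payload(payload, charset=utf8qp)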

And use it like this:

m = MIMEUTF8QPText(u'This is the text containing ünicöde')
m.as_string()
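
One way to check the result is to parse the serialized message again and decode the payload. The following is only a minimal sketch, assuming Python 3; the variable names (`part`, `raw`, `parsed`, `decoded`) are made up for illustration and it builds the part inline instead of using the class above:

```python
import email
import email.charset
import email.mime.nonmultipart

text = 'This is the text containing ünicöde'

# Same construction as in the answer: a UTF-8 charset with QP body encoding
cs = email.charset.Charset('utf-8')
cs.body_encoding = email.charset.QP
part = email.mime.nonmultipart.MIMENonMultipart('text', 'plain', charset='utf-8')
part.set_payload(text, charset=cs)

# Serialize, parse again, and undo the transfer encoding
raw = part.as_string()
parsed = email.message_from_string(raw)
decoded = parsed.get_payload(decode=True).decode('utf-8')

print(parsed['Content-Transfer-Encoding'])
print(decoded)
```

If the header reads quoted-printable, appears only once, and the decoded text matches the input, the part round-trips as intended.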
Rob Bednark
theomega

In Python 3 you do not need the hack, because MIMEText accepts a Charset instance directly:

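# A bare "import email" does not load these submodules, so import them explicitly
import email.charset
import email.mime.text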

# Construct a new charset which uses Quoted Printables (base64 is default)
cs = email.charset.Charset('utf-8')
cs.body_encoding = email.charset.QP

m = email.mime.text.MIMEText(u'This is the text containing ünicöde', 'plain', _charset=cs)

print(m.as_string())
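
For newer Python 3 versions (the comments on the question mention 3.6+), the `EmailMessage` API may be an alternative. This is a sketch under that assumption, not part of the answer above; it relies on the `cte` keyword of `set_content` for text content, and the names `msg` and `raw` are chosen only for illustration:

```python
from email.message import EmailMessage

msg = EmailMessage()
# cte selects the Content-Transfer-Encoding for the text body
msg.set_content('This is the text containing ünicöde',
                charset='utf-8', cte='quoted-printable')

raw = msg.as_string()
print(raw)
```

Here no separate Charset object is needed, since the transfer encoding is requested at the point where the content is set.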
Illia Somov
  • 2
    To be fair, the hack was needed in Python 2. Your answer only works with Python 3. So basically you could say the original issue can be solved by switching to Python 3. – Felix Schwarz Oct 01 '20 at 14:43
Adapted from issue 1525919 and tested on Python 2.7:

from email.Message import Message
from email.Charset import Charset, QP

# Byte string that is already UTF-8 encoded ("á = é"); body_encode does not encode it for you
text = "\xc3\xa1 = \xc3\xa9"
msg = Message()

charset = Charset('utf-8')
charset.header_encoding = QP
charset.body_encoding = QP

msg.set_charset(charset)
msg.set_payload(msg._charset.body_encode(text))

print msg.as_string()

will give you:

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

=C3=A1 =3D =C3=A9

Also see this response from a Python committer.

Community
mmoya
  • I missed at first that the input to `body_encode` must already be utf-8 encoded, and that it doesn't do the utf-8 encoding for you. Noting this here in case it saves others the pain of the same misunderstanding. – new name Jun 18 '17 at 22:33
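
For comparison, a small sketch of the same `body_encode` call, assuming Python 3: there the method takes a `str` and encodes it to the charset's output encoding before applying quoted-printable, so the manual UTF-8 step described in the comment above is specific to Python 2. The variable names are illustrative only:

```python
from email.charset import Charset, QP

cs = Charset('utf-8')
cs.body_encoding = QP

# In Python 3 a str is accepted and encoded to UTF-8 before QP is applied
encoded = cs.body_encode('á = é')
print(encoded)  # =C3=A1 =3D =C3=A9
```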