20

I am using the Python module MimeWriter to construct a message and smtplib to send it. The constructed message is:

file msg.txt:
-----------------------
Content-Type: multipart/mixed;
from: me<me@abc.com>
to: me@abc.com
subject: 主題

Content-Type: text/plain;charset=utf-8

主題

I use the code below to send a mail:

import smtplib
s=smtplib.SMTP('smtp.abc.com')
toList = ['me@abc.com']
f=open('msg.txt') #above msg in msg.txt file
msg=f.read()
f.close()
s.sendmail('me@abc.com',toList,msg)

The mail body arrives correctly, but the subject is garbled:

subject: some junk characters

主題           <- body is correct.

Any suggestions? Is there a way to specify the encoding used for the subject, as is already done for the body? How can I get the subject to display correctly?

Makoto
Rakesh

3 Answers

36

From http://docs.python.org/library/email.header.html

from email.message import Message
from email.header import Header
msg = Message()
msg['Subject'] = Header('主題', 'utf-8')
print(msg.as_string())

Subject: =?utf-8?b?5Li76aGM?=

More simply:

from email.header import Header
print(Header('主題', 'utf-8').encode())

=?utf-8?b?5Li76aGM?=

As a complement, decoding can be done with:

from email.header import decode_header
a = decode_header("""=?utf-8?b?5Li76aGM?=""")[0]
print(a[0].decode(a[1]))

Reference: Python - email header decoding UTF-8
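Combined with the sending code from the question, the approach above could look roughly like the sketch below. It assumes Python 3 and the legacy `Message`-based API. The helper names `build_message` and `send` are hypothetical, and the host and addresses are the placeholders from the question. `send` is defined but not called, since it needs a reachable SMTP server.

```python
import smtplib
from email.header import Header
from email.mime.text import MIMEText


def build_message(sender, recipient, subject, body):
    """Hypothetical helper: UTF-8 body plus an RFC 2047 encoded subject."""
    msg = MIMEText(body, 'plain', 'utf-8')
    msg['From'] = sender
    msg['To'] = recipient
    # Header() produces the =?utf-8?b?...?= form when the message is serialized
    msg['Subject'] = Header(subject, 'utf-8')
    return msg


def send(msg, host='smtp.abc.com'):
    """Hypothetical helper: host is the placeholder used in the question."""
    with smtplib.SMTP(host) as s:
        s.sendmail(msg['From'], [msg['To']], msg.as_string())


msg = build_message('me@abc.com', 'me@abc.com', '主題', '主題')
wire_text = msg.as_string()  # pure ASCII, safe to hand to sendmail()
print(wire_text)
```

Building the message with the email package instead of reading a hand-written msg.txt means both the subject and the body are encoded before they reach smtplib.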

Sérgio
  • 1
    Note that this uses the [`email.message.Message`](https://docs.python.org/3/library/email.compat32-message.html#compat32-message) API, which was superseded by `email.message.EmailMessage` in Python 3.6. With the new API you must assign a string: `msg['Subject'] = 'unicode string'`, as [assigning Header objects is not supported](https://bugs.python.org/issue21095). In my experience as of 3.7.3 the "legacy" API works better - some encoding bugs are fixed in 3.8 – Nickolay Jun 21 '19 at 10:41
  • Thanks for the heads-up – Sérgio Jun 21 '19 at 17:21
8

The subject is transmitted as a mail message header, and headers are required to be ASCII-only. To use non-ASCII characters in the subject, you need to wrap it in an "encoded word" (RFC 2047) that names the charset and the transfer encoding. In your case, I would suggest Base64-encoding the UTF-8 subject and wrapping the result between =?UTF-8?B? and ?=, where UTF-8 is the charset and B means Base64.

In other words, I believe your subject header should more or less look like this:

Subject: =?UTF-8?B?5Li76aGM?=

In PHP you could go about it like this:

// Convert subject to base64
$subject_base64 = base64_encode($subject);
fwrite($smtp, "Subject: =?UTF-8?B?{$subject_base64}?=\r\n");

In Python:

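import base64
subject = '主題'
# b64encode works on bytes; base64.encodestring was removed in Python 3.9
subject_base64 = base64.b64encode(subject.encode('utf-8')).decode('ascii')
subject_line = "Subject: =?UTF-8?B?%s?=" % subject_base64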
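To check that a hand-built encoded word is well formed, it can be decoded again with the standard library. The sketch below assumes Python 3. `encode_subject` is a hypothetical helper name, and it only handles short subjects: RFC 2047 limits each encoded word to 75 characters, so longer subjects have to be split into several words, which `email.header.Header` does automatically.

```python
import base64
from email.header import decode_header


def encode_subject(subject):
    """Hypothetical helper: wrap a short subject in one Base64 encoded word."""
    payload = base64.b64encode(subject.encode('utf-8')).decode('ascii')
    return '=?UTF-8?B?%s?=' % payload


encoded = encode_subject('主題')

# decode_header returns a list of (bytes, charset) pairs
raw, charset = decode_header(encoded)[0]
decoded = raw.decode(charset)

print(encoded)
print(decoded)
```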
Jimm Chen
AHM
  • 1
    I'll try this. In the meantime, is there a Python API that converts to the above format, i.e. automatically adds the required characters based on the encoding? – Rakesh Aug 02 '11 at 14:13
  • 1
    I'm not sure - I just remembered that part from when I was messing around with this issue a while ago. [This answer](http://stackoverflow.com/questions/5910104/python-how-to-send-utf-8-e-mail#answer-5910530) seems to suggest that it is done right if you use the MIMEMultipart class instead of MimeWriter. – AHM Aug 02 '11 at 14:27
  • 2
    You should look here for [how to generate internationalized headers](http://docs.python.org/library/email.header.html) – mata May 20 '12 at 21:39
1

In short, if you use the EmailMessage API, you should code like this:

from email.message import EmailMessage
from email.header import Header
msg = EmailMessage()
msg['Subject'] = Header('主題', 'utf-8').encode()

The answer from @Sérgio cannot be used with the EmailMessage API, because only a string can be assigned to EmailMessage()["Subject"], not an email.header.Header object.
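As the comment under the first answer points out, the EmailMessage API also accepts a plain Unicode string for the subject. A minimal sketch, assuming Python 3.6+ and the default policy, where the header is encoded only when the message is serialized. The addresses are the placeholders from the question, and the round trip through `message_from_bytes` is only there to check the result.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

msg = EmailMessage()
msg['From'] = 'me@abc.com'
msg['To'] = 'me@abc.com'
msg['Subject'] = '主題'      # plain str; no Header object needed
msg.set_content('主題')      # text/plain body with a UTF-8 charset

raw = msg.as_bytes()

# Everything before the first blank line is the header block
header_block = raw.split(b'\n\n', 1)[0]
print(header_block.decode('ascii'))

# Parse the serialized message back to confirm the subject survives
parsed = message_from_bytes(raw, policy=policy.default)
print(parsed['Subject'])
```

A message built this way can be passed to `smtplib.SMTP.send_message(msg)`, so no manual serialization is needed.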

C.K.