
My Python application uses email.header.Header (http://docs.python.org/2/library/email.header.html) to encode all headers of outgoing email, including the From header, as described in "Encoding mail subject (SMTP) in Python with non-ASCII characters".

It works perfectly for ASCII sender names, but for senders like

Adrian Płonka <pokemon@myservice.com>

it produces

From: =?utf-8?q?Adrian_P=C5=82onka_=3Cpokemon=40myservice=2Ecom=3E?=

Unfortunately, Gmail apparently doesn't accept this encoding: it displays the sender as (unknown) and marks the whole message as spam.

How do I properly encode non-ASCII senders?

Community
Marcin
  • Show the call that you use to do the encoding, please. – msw Nov 17 '13 at 16:26
  • I'm using `message['From'] = email.header.Header(email.utils.formataddr((u'Adrian Płonka', u'pokemon@myservice.com'), charset))`. Anyway, the result (as I can see both in the log and in Gmail's "show original message") is: `=?utf-8?q?Adrian_P=C5=82onka_=3Cpokemon=40myservice=2Ecom=3E?=` – Marcin Nov 17 '13 at 16:36

2 Answers


The proper way to encode that is

From: =?utf-8?q?Adrian_P=C5=82onka?= <pokemon@myservice.com>

That is, only the display name may be RFC 2047-encoded; the email address itself, including the angle brackets around it, must stay as plain ASCII.
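As a rough illustration of that rule, here is a minimal sketch assuming Python 3.3 or later (the question itself targets Python 2, where `formataddr` has no `charset` argument). The helper name `format_from` is made up for this example. In Python 3, `email.utils.formataddr` encodes a non-ASCII display name on its own and leaves the address untouched:

```python
from email.utils import formataddr


def format_from(name, address):
    """Build a From value where only the display name is RFC 2047-encoded.

    `format_from` is a hypothetical helper for this example; the work is
    done by email.utils.formataddr, which (in Python 3.3+) encodes the
    name with the given charset only when it contains non-ASCII characters.
    """
    return formataddr((name, address), charset='utf-8')


# The address part stays readable; only the name becomes an encoded word.
from_value = format_from('Adrian Płonka', 'pokemon@myservice.com')
print(from_value)
```

The resulting string is pure ASCII, so it can be assigned to `message['From']` directly without wrapping it in `Header` again.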

tripleee
  • What if the email address has international characters, e.g. 施設@blah.jp – jeznag Jul 03 '17 at 01:00
  • UTF-8 is supported in addresses with the [internationalization extensions](https://en.wikipedia.org/wiki/Email_address#Internationalization) and an address in this format would be embedded without encoding it; but this facility has only existed for a limited time, so it's unlikely to work very well in many clients just yet. But on systems which support this, you can use bare UTF-8 throughout the headers. See https://en.wikipedia.org/wiki/International_email – tripleee Jul 03 '17 at 03:53

The most elegant way I can think of to achieve the result from @tripleee's answer is:

message['From'] = formataddr((charset.header_encode('Adrian Płonka'), 'pokemon@myservice.com'))

where charset is an email.charset.Charset object that I created with:

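import email.charset
from email.utils import formataddr

# UTF-8 charset that uses quoted-printable for both headers and bodies
charset = email.charset.Charset()
charset.body_encoding = email.charset.QP
charset.header_encoding = email.charset.QP
charset.input_charset = 'utf-8'
charset.output_charset = 'utf-8'
charset.input_codec = 'utf-8'
charset.output_codec = 'utf-8'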

It displays correctly with Gmail as well as with other providers.

This was non-trivial for me to find, hope it helps...

Marcin
  • Has this changed with the `email` library overhaul introduced in Python 3.5, and made the default implementation in 3.6? – tripleee Jul 03 '17 at 03:55
  • To belatedly answer my own question, it looks like the OP's original code `str(Header(formataddr((u'Adrian Płonka', u'pokemon@myservice.com'), 'utf-8')))` now returns the correct value `'=?utf-8?q?Adrian_P=C5=82onka?= <pokemon@myservice.com>'` as of Python 3.5.1. The new "proper" way to do it is probably a lot simpler still. – tripleee Oct 10 '17 at 11:42
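
Following up on the last comment, here is a sketch of what the newer API might look like, assuming Python 3.6+ with `email.message.EmailMessage` and `email.headerregistry.Address`. The subject and body are placeholders, and this has not been tested against Gmail. The display name and the address parts are passed separately, and the library encodes only the name when the message is serialized:

```python
from email.headerregistry import Address
from email.message import EmailMessage

msg = EmailMessage()
# Display name, local part and domain are given separately, so the library
# knows which part may be RFC 2047-encoded and which must stay as-is.
msg['From'] = Address('Adrian Płonka', 'pokemon', 'myservice.com')
msg['Subject'] = 'Test'    # placeholder subject
msg.set_content('Hello')   # placeholder body

# With the default policy, serializing yields ASCII-only headers.
raw = msg.as_string()
print(raw)
```

Here `str(msg['From'])` still gives the readable form with the original name; the encoded form only appears in the serialized message.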