11

I'm working on an email project. For reasons that I will not go into here, doing quoted-printable encoding on long email messages is problematic in the customer's environment.

Base64-encoding the HTML and text sections of the SMTP emails we are sending seems like a viable option. In testing, it seems to work just fine in a couple of test clients (like Gmail).

However, I'm wondering if this will present any issues across different email clients. From reading the RFC specs, it looks like base64 is a compliant encoding for text sections, but it's unusual enough for text & HTML sections that I'd like to know if there are any potential issues to consider.

Things that seem like problematic possibilities:

  • perhaps some older or less robust clients don't expect base64 in text or HTML email sections, and will fail to decode it
  • perhaps some email clients do a preview or search based on the raw content, so recipients would see base64 instead of the actual content
  • perhaps base64 could negatively influence deliverability/spam scoring?

Does anybody have experiences they can share? This seems like a good solution but I'd like to make sure I'm not missing something.

jkraybill

1 Answer

14

This is hard to answer. Yes, quoted-printable is used more often, simply because it wastes fewer bytes on mostly-ASCII text and because the raw text of the mail body part then resembles the decoded output. There is nothing which forbids using base64 for the textual message parts, though.
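For anyone who wants to reproduce this in a test, here is a minimal sketch using Python's standard `email` package. It is only an illustration, not necessarily how the asker's system builds mail: the addresses, subject, and body strings are placeholders I made up. It builds a multipart/alternative message whose text and HTML parts both use the base64 content transfer encoding.

```python
from email.message import EmailMessage

# Placeholder bodies (assumptions for illustration only).
text_body = "Plain-text body with \u201csmart quotes\u201d.\n"
html_body = "<p>HTML body with \u201csmart quotes\u201d.</p>\n"

msg = EmailMessage()
msg["Subject"] = "Base64 body test"
msg["From"] = "sender@example.com"      # placeholder address
msg["To"] = "recipient@example.com"     # placeholder address

# cte="base64" asks the library to base64-encode the part instead of
# picking quoted-printable or 8bit on its own.
msg.set_content(text_body, charset="utf-8", cte="base64")
msg.add_alternative(html_body, subtype="html", charset="utf-8", cte="base64")

parts = list(msg.iter_parts())
for part in parts:
    print(part.get_content_type(), part["Content-Transfer-Encoding"])

# The serialized message contains only base64 text for the bodies, which is
# what a client searching or previewing the raw source would see.
raw = msg.as_string()
```

Sending `raw` to a handful of real clients is still the only way to find out how each one behaves; the snippet just produces a compliant test message.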

This is pretty much an open question -- you cannot ever be sure that a MUA somewhere is not hopelessly broken to the extent of not showing anything. There's a lot of "perhaps" in there, and you're right -- but the problem is that you will never know. If it will make you sleep better, the following companies all use base64-encoded HTML in the marketing spam I'm receiving:

  • Mellanox
  • Alza.cz
  • Aukro.cz
  • Journal of Modern Physics

Any MUA which can display embedded images has to include a base64 decoder. It is definitely possible that a MUA might explicitly refuse to use that code for decoding text/plain and text/html, but in that case, you're just screwed anyway.

As a fun fact, one of these companies is happy to break the UTF-8 encoded subject at the byte boundary, inside a multibyte character, and encode both halves of the text in separate encoded-words (RFC2047 terminology here).
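To make the problem concrete, here is a small sketch of what such a split looks like. RFC 2047 requires each encoded-word to hold a whole number of characters, so this is deliberately non-compliant output. The subject string and the helper names `encoded_word` and `payload` are my own, chosen for the example.

```python
import base64

PREFIX = "=?UTF-8?B?"
SUFFIX = "?="

def encoded_word(chunk: bytes) -> str:
    """Wrap raw bytes in a base64 ("B") encoded-word labelled UTF-8."""
    return PREFIX + base64.b64encode(chunk).decode("ascii") + SUFFIX

def payload(word: str) -> bytes:
    """Return the raw bytes carried by one encoded-word."""
    return base64.b64decode(word[len(PREFIX):-len(SUFFIX)])

subject = "\u017dlu\u0165ou\u010dk\u00fd k\u016f\u0148"  # Czech sample text
raw = subject.encode("utf-8")

# The first character is two bytes long in UTF-8, so cutting after one byte
# splits it across two encoded-words.
words = [encoded_word(raw[:1]), encoded_word(raw[1:])]

# A strict decoder that handles each encoded-word on its own fails here.
try:
    payload(words[0]).decode("utf-8")
    per_word_ok = True
except UnicodeDecodeError:
    per_word_ok = False

# A lenient decoder that joins the bytes before decoding recovers the text.
joined = b"".join(payload(w) for w in words).decode("utf-8")
print(words, per_word_ok, joined == subject)
```

Whether a given MUA shows the subject correctly depends on which of the two strategies it implements.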

Jan Kundrát
  • Thanks for your thoughts. Another data point I found which makes me feel that Base64 is an acceptable solution: if you paste high-ascii characters like smart quotes into Gmail, then send the email, Google encodes the text/plain segment using Base64. – jkraybill May 04 '13 at 06:47
  • The overhead of Base64 is constant: about 1.33 times the size of the original data. The overhead of QP depends on the language of your mail (assuming UTF-8). If all your mail is in English (ASCII-compatible), QP adds little overhead. Each non-ASCII byte, however, is encoded as three characters, so QP-encoded non-ASCII data is 3 times the size of the original. If your mail is in East Asian languages such as Chinese, Japanese, or Korean, Base64 is more efficient than QP. – Zhuoyun Wei Feb 19 '16 at 05:12
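The size comparison is easy to check on your own content. Below is a rough sketch using Python's standard `base64` and `quopri` modules; the sample strings and the helper name `encoded_sizes` are assumptions for illustration, and the exact numbers will vary with line wrapping.

```python
import base64
import quopri

def encoded_sizes(text: str) -> tuple[int, int, int]:
    """Return (raw, base64, quoted-printable) sizes in bytes for UTF-8 text."""
    raw = text.encode("utf-8")
    b64 = base64.encodebytes(raw)      # wraps lines, adds newlines
    qp = quopri.encodestring(raw)      # adds soft line breaks as needed
    return len(raw), len(b64), len(qp)

# Sample texts (made up for this comparison).
english = "The quick brown fox jumps over the lazy dog. " * 20
japanese = "日本語のメール本文です。" * 20

for label, sample in (("english", english), ("japanese", japanese)):
    raw_len, b64_len, qp_len = encoded_sizes(sample)
    print(f"{label}: raw={raw_len} base64={b64_len} qp={qp_len}")
```

On mostly-ASCII text the quoted-printable output should come out smaller, while on text made entirely of multibyte characters base64 should win by a wide margin.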