
I have this header from an e-mail message: (some headers only, from a log file, not the entire message)

From: =?UTF-8?B?TmFtZSDDpMO2w7w=?= <test@example.com>

How can I decode the part after the "From:" into the Unicode string that it was when I entered it into Thunderbird to send the message?

Other relevant occurrences are the Subject header where I think I've already seen something like this:

Subject: Hello =?UTF-8?B?TmFtZSDDpMO2w7w=?= more words

I already found the QuotedPrintableDecoder class in MimeKit but can't find out how to use that thing. It seems to want me to let it guess the decoded byte count and then convert the bytes from the hopefully large enough buffer into a string, but doesn't tell me what Encoding to use.

I need a short example of how to convert such header strings that contain partial quoted-printable words including a specified encoding. An example of the kind that one would expect to find in the documentation. Actually all data is right here but needs to be converted from one string to another.

Another hack using Attachment.CreateAttachmentFromString fails here. It can decode entire QP strings but not mixed formats like this one.

PS: If there are solutions in .NET Core 3.1 or 5.0 without the need for MimeKit, I'll happily accept those as well!

ygoe

1 Answer


When you parse a message with MimeMessage.Load(), you don't need to decode headers because MimeKit will do it for you.

Secondly, your example header is not encoded using quoted-printable. It is encoded using RFC 2047 encoded-words (the "B" after the charset means the payload is base64), which need to be decoded using Rfc2047.DecodeText():

// Requires: using System.Text; using MimeKit.Utils;
var decoded = Rfc2047.DecodeText (Encoding.ASCII.GetBytes ("Hello =?UTF-8?B?TmFtZSDDpMO2w7w=?= more words")); // "Hello Name äöü more words"
jstedfast
  • I don't have a complete .eml file, just the header from a log file. I guess the MimeMessage thing isn't helpful then. – ygoe Nov 29 '20 at 12:14
  • Then use the Rfc2047.DecodeText()/DecodePhrase() methods. For address headers, I would recommend stripping off the "From:", "To:", or "Cc:" and then passing the remainder of the string to InternetAddressList.Parse(), which will parse *and* decode it for you. For the Subject header, strip off "Subject:" and pass the rest to Rfc2047.DecodeText(). – jstedfast Nov 30 '20 at 02:36
  • Note: Rfc2047.DecodeText() is for unstructured header fields like the Subject header (i.e. for free-form text content). Rfc2047.DecodePhrase() is for "phrases" such as the names in an address header. – jstedfast Nov 30 '20 at 02:38
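
Putting the comments above together, here is a minimal sketch of decoding the two header lines from the question when only the raw lines from a log file are available. It assumes the MimeKit package is referenced. The StripFieldName helper and the class name are hypothetical, introduced only for illustration. They are not part of MimeKit.

```csharp
using System;
using System.Text;
using MimeKit;
using MimeKit.Utils;

class HeaderDecodeSketch
{
    // Hypothetical helper (not a MimeKit API): returns everything after the
    // first ':' of a raw header line, i.e. the field value without its name.
    static string StripFieldName (string headerLine)
    {
        int colon = headerLine.IndexOf (':');
        return colon < 0 ? headerLine : headerLine.Substring (colon + 1).Trim ();
    }

    static void Main ()
    {
        string from = "From: =?UTF-8?B?TmFtZSDDpMO2w7w=?= <test@example.com>";
        string subject = "Subject: Hello =?UTF-8?B?TmFtZSDDpMO2w7w=?= more words";

        // Address header: InternetAddressList.Parse() parses the addresses
        // and decodes the encoded display names in one step.
        var addresses = InternetAddressList.Parse (StripFieldName (from));
        foreach (var mailbox in addresses.Mailboxes)
            Console.WriteLine ($"{mailbox.Name} <{mailbox.Address}>");

        // Unstructured header (free-form text): use Rfc2047.DecodeText().
        // The raw header value is ASCII, so ASCII bytes are sufficient here.
        byte[] rawSubject = Encoding.ASCII.GetBytes (StripFieldName (subject));
        Console.WriteLine (Rfc2047.DecodeText (rawSubject));
    }
}
```

With the sample headers from the question, the display name and the encoded part of the subject should both come out as "Name äöü". Whether the umlauts render correctly on the console depends on the console's output encoding.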