12

I'm trying to extract the subject and body of an email with .NET. Everything works except for the text/html MessagePart: decoding its body data throws an error. I'm not sure which encoding it uses. Has anybody got this working?

Here is the raw string from the Body.Data of the text/html part:

"PGRpdiBkaXI9Imx0ciI-dGV4dCBpbiBoZXJlPGJyPjwvZGl2Pg0K"

Decoding it throws this error:

"The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters."

Here is the code:

    UsersResource.MessagesResource.GetRequest gr = gs.Users.Messages.Get(userEmail, TextBox1.Text);
    gr.Format = UsersResource.MessagesResource.GetRequest.FormatEnum.Full;
    Message m = gr.Execute();

    foreach (MessagePart p in m.Payload.Parts)
    {
        if (p.MimeType == "text/html")
        {
            try
            {
                byte[] data = Convert.FromBase64String(p.Body.Data);
                string decodedString = Encoding.UTF8.GetString(data);
                Response.Write(decodedString);
            }
            catch (Exception ex) { }
        }
    }

Am I getting the decoding wrong?

Thanks for your help.

PNC
  • Same issue with me - just to let you know you are not alone! – Devfly Jun 29 '14 at 00:11
  • Good to hear - I've tried a number of approaches with the same outcome. Having the same issue also with the whole raw message when trying to parse to my MIME parser. – PNC Jun 29 '14 at 03:41

4 Answers

23

The body data appears to be base64url-encoded, not standard base64-encoded. The two alphabets differ only in their last two characters: base64url uses - and _ where base64 uses + and /. In the string from the question, the - is the character that FromBase64String rejects. One solution is to replace every - with + and every _ with / before calling FromBase64String.

See https://www.rfc-editor.org/rfc/rfc4648#section-5
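As a minimal sketch of that replacement, the helper below maps the URL-safe alphabet back to standard base64 and then decodes. The class and method names (`Base64UrlSketch`, `Decode`, `DecodeUtf8`) are hypothetical, and the UTF-8 charset is an assumption carried over from the question's code. It also restores any missing `=` padding, which the question's sample string happens not to need.

```csharp
using System;
using System.Text;

public static class Base64UrlSketch
{
    // Hypothetical helper: converts base64url text to standard base64, then decodes it.
    public static byte[] Decode(string base64Url)
    {
        // Map the URL-safe alphabet back to the standard one.
        string s = base64Url.Replace('-', '+').Replace('_', '/');

        // Convert.FromBase64String expects a length that is a multiple of 4,
        // so restore any '=' padding that was stripped by the sender.
        int remainder = s.Length % 4;
        if (remainder != 0)
        {
            s = s.PadRight(s.Length + (4 - remainder), '=');
        }

        return Convert.FromBase64String(s);
    }

    // Assumes the decoded bytes are UTF-8 text, as in the question's code.
    public static string DecodeUtf8(string base64Url)
    {
        return Encoding.UTF8.GetString(Decode(base64Url));
    }
}
```

Calling `Base64UrlSketch.DecodeUtf8` on the string from the question returns `<div dir="ltr">text in here<br></div>` followed by a CRLF line break.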

Community
user3788724
  • You'll also want to do something like this (from a different question): `encoded = b64.replace(/_/g, '/').replace(/-/g,'+'); atob( unescape( encodeURIComponent( encoded ) ) );` – maxko87 Oct 18 '16 at 05:17
10

Here is the code I ended up using:

    foreach (MessagePart p in m.Payload.Parts)
    {
        if (p.MimeType == "text/html")
        {
            byte[] data = FromBase64ForUrlString(p.Body.Data);
            string decodedString = Encoding.UTF8.GetString(data);
            Response.Write(decodedString);
        }
    }

....

    public static byte[] FromBase64ForUrlString(string base64ForUrlInput)
    {
        // Standard base64 text must be a multiple of 4 characters long,
        // so work out how many '=' padding characters are missing.
        int padChars = (base64ForUrlInput.Length % 4) == 0 ? 0 : (4 - (base64ForUrlInput.Length % 4));
        StringBuilder result = new StringBuilder(base64ForUrlInput, base64ForUrlInput.Length + padChars);
        result.Append(String.Empty.PadRight(padChars, '='));
        // Swap the URL-safe characters back to the standard base64 alphabet.
        result.Replace('-', '+');
        result.Replace('_', '/');
        return Convert.FromBase64String(result.ToString());
    }

Good article: http://www.codeproject.com/Tips/76650/Base-base-url-base-url-and-z-base-encoding
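The padding step in FromBase64ForUrlString matters whenever the incoming text has had its trailing = characters stripped, which base64url permits. The round trip below is only a sketch to illustrate that: `Base64UrlRoundTrip` and its `Encode` method are hypothetical names that do not appear in the code above, and `Decode` follows the same approach as FromBase64ForUrlString in a more compact form.

```csharp
using System;

public static class Base64UrlRoundTrip
{
    // Hypothetical encoder: standard base64, padding stripped, URL-safe alphabet.
    public static string Encode(byte[] bytes)
    {
        return Convert.ToBase64String(bytes)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Restore the padding, swap the alphabet back, then decode.
    public static byte[] Decode(string base64Url)
    {
        // 0, 1, or 2 '=' characters are needed to reach a multiple of 4.
        int padChars = (4 - base64Url.Length % 4) % 4;
        string padded = base64Url + new string('=', padChars);
        return Convert.FromBase64String(padded.Replace('-', '+').Replace('_', '/'));
    }
}
```

For example, the two bytes 0xFB 0xFF encode to `+/8=` in standard base64 and to `-_8` here. Passing `-_8` straight to Convert.FromBase64String would fail on both the alphabet and the length, while `Decode` returns the original two bytes.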

PNC
4

The Gmail API reference page https://developers.google.com/gmail/api/v1/reference/users/messages/attachments/get includes a .NET example that decodes the data like this:

    // Converting from RFC 4648 base64-encoding
    // see http://en.wikipedia.org/wiki/Base64#Implementations_and_history
    String attachData = attachPart.Data.Replace('-', '+');
    attachData = attachData.Replace('_', '/');
    byte[] data = Convert.FromBase64String(attachData);

lost in binary
0

The WebEncoders.Base64UrlDecode method in the Microsoft.AspNetCore.WebUtilities assembly can decode this now:

    var bytes = WebEncoders.Base64UrlDecode(data);

Mike