14

I keep getting a Base64 invalid character error even though I shouldn't.

The program takes an XML file and exports it to a document. If the user wants, it will compress the file as well. The compression works fine and returns a Base64 String which is encoded into UTF-8 and written to a file.

When it's time to reload the document into the program I have to check whether it's compressed or not. The code is simply:

byte[] gzBuffer = System.Convert.FromBase64String(text);
return "1F-8B-08" == BitConverter.ToString(new List<Byte>(gzBuffer).GetRange(4, 3).ToArray());

It decodes the string, skips the first 4 bytes, and checks whether the next 3 bytes are the GZip header (1F 8B 08).

Now the thing is, all my tests work. I take a string, compress it, decompress it, and compare it to the original. The problem is when I get the string returned from an ADO Recordset. The string is exactly what was written to the file (with the addition of a "\0" at the end, but I don't think that even does anything, even trimmed off it still throws). I even copied and pasted the entire string into a test method and compressed/decompressed that. Works fine.

The tests will pass but the code will fail using the exact same string? The only difference is instead of just declaring a regular string and passing it in I'm getting one returned from a recordset.

Any ideas on what I am doing wrong?

Brandon
  • It would probably help if you'd post an example of a string that you're passing to Convert.FromBase64String (e.g. what you get on output if you put a Debug.Write directly before the call) – Daniel LeCheminant Apr 02 '09 at 19:34
  • ...even if you posted the first and last 8 or so bytes, and the string length, that would probably be enough to see that the string is the correct format. – Daniel LeCheminant Apr 02 '09 at 19:40
  • qGcAAB+LCA ... cAAA== It's 2376 characters long. – Brandon Apr 02 '09 at 19:44
  • Are you still getting a FormatException:Invalid Characters, or is it a different exception? – Daniel LeCheminant Apr 02 '09 at 19:48
  • ... I guess you must have figured out what the problem was? – Daniel LeCheminant Apr 02 '09 at 19:58
  • Yes, the \0 was actually the problem. (Well, one of them). I didn't see that as the issue in the tests because the parent method the tests call considers a non-Base64 string an unformatted file. But all is well now, thanks for the help :) – Brandon Apr 02 '09 at 21:44

5 Answers

16

You say

The string is exactly what was written to the file (with the addition of a "\0" at the end, but I don't think that even does anything).

In fact, it does do something: it causes your code to throw a FormatException ("Invalid character in a Base-64 string"), because Convert.FromBase64String does not consider "\0" to be a valid Base64 character.

  byte[] data1 = Convert.FromBase64String("AAAA\0"); // Throws exception
  byte[] data2 = Convert.FromBase64String("AAAA");   // Works

Solution: Get rid of the zero termination. (Maybe call .TrimEnd('\0'); note that Trim and TrimEnd take char arguments, not a string.)
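As a rough sketch of that fix applied to the check from the question: the class and method names below (GZipSniffer, LooksCompressed) are made up for illustration, and treating a non-Base64 string as "not compressed" is an assumption based on the asker's later comment, not something the original code does.

```csharp
using System;

static class GZipSniffer
{
    // Hypothetical helper: returns true when 'text' is Base64 whose decoded
    // bytes carry the GZip header 1F 8B 08 at offset 4, as in the question.
    public static bool LooksCompressed(string text)
    {
        if (text == null) return false;

        // Drop trailing NUL characters before decoding.
        string cleaned = text.TrimEnd('\0');

        byte[] buffer;
        try
        {
            buffer = Convert.FromBase64String(cleaned);
        }
        catch (FormatException)
        {
            // Assumption: anything that is not Base64 is an uncompressed file.
            return false;
        }

        // Need at least 7 bytes: 4 skipped bytes plus the 3-byte header.
        return buffer.Length >= 7
            && buffer[4] == 0x1F
            && buffer[5] == 0x8B
            && buffer[6] == 0x08;
    }
}
```

Comparing the three bytes directly avoids building a List<Byte> and a formatted string just to test a header.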

Notes:

The MSDN docs for Convert.FromBase64String say it will throw a FormatException when

The length of s, ignoring white space characters, is not zero or a multiple of 4.

-or-

The format of s is invalid. s contains a non-base 64 character, more than two padding characters, or a non-white space character among the padding characters.

and that

The base 64 digits in ascending order from zero are the uppercase characters 'A' to 'Z', lowercase characters 'a' to 'z', numerals '0' to '9', and the symbols '+' and '/'.
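If it isn't obvious which character is upsetting the decoder, a small scan like the one below can point at it. This is only a diagnostic sketch: Base64Diagnostics and IndexOfFirstInvalidChar are hypothetical names, the white space set (space, tab, CR, LF) is my assumption about what the decoder skips, and the scan does not check the length or the position of the padding.

```csharp
using System;

static class Base64Diagnostics
{
    // Returns the index of the first character that is neither a Base64 digit,
    // a padding '=', nor white space; returns -1 when no such character exists.
    public static int IndexOfFirstInvalidChar(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            bool isDigit = (c >= 'A' && c <= 'Z')
                        || (c >= 'a' && c <= 'z')
                        || (c >= '0' && c <= '9')
                        || c == '+' || c == '/';
            bool isPadding = c == '=';
            bool isWhiteSpace = c == ' ' || c == '\t' || c == '\r' || c == '\n';

            if (!isDigit && !isPadding && !isWhiteSpace) return i;
        }
        return -1;
    }
}
```

For the string from the question, a result equal to text.Length - 1 would point straight at the trailing "\0".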

Daniel LeCheminant
  • I trim the \0 off, it still throws. – Brandon Apr 02 '09 at 19:23
  • It still throws a FormatException, or something else? What is the exact string being passed to FromBase64String? – Daniel LeCheminant Apr 02 '09 at 19:31
  • The exact string is a little bit long to post. Is there a size limit I don't know about? What is there is valid though, I checked it for any characters not allowed in Base64. Maybe I just did the trim wrong, although it doesn't explain why the tests are running fine. – Brandon Apr 02 '09 at 19:36
  • @Brandon: Is the length a multiple of 4? Honestly, even if you posted the first and last 8 bytes, and the string length, that would probably be enough to see that the string is the correct format. – Daniel LeCheminant Apr 02 '09 at 19:39
  • It is a multiple of 4, and I'm assuming the == at the end of the string (see my response to the original post) is there just for padding purposes? – Brandon Apr 02 '09 at 19:45
  • @Brandon: Yeah, it makes the length a multiple of 4 – Daniel LeCheminant Apr 02 '09 at 19:47
3

Whether a null character is allowed really depends on the Base64 codec in question. Given the vagueness of the Base64 standard (there is no authoritative exact specification), many implementations just ignore it as white space, while others flag it as a problem. The buggiest ones don't notice and happily try to decode it... :-/

But it sounds like the C# implementation does not like it (which is one valid approach), so if removing it helps, that should be done.

One minor additional comment: UTF-8 is not a requirement; ISO-8859-x (aka Latin-x) and 7-bit ASCII would work as well. This is because Base64 was specifically designed to use only the 7-bit subset, which works with all 7-bit-ASCII-compatible encodings.
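A quick way to convince yourself of the encoding point is to encode the same Base64 text three ways and compare the bytes. EncodingCheck and SameBytesInAllEncodings are hypothetical names used only for this sketch.

```csharp
using System;
using System.Linq;
using System.Text;

static class EncodingCheck
{
    // Returns true when UTF-8, ASCII and ISO-8859-1 all produce the same
    // bytes for the given text, which should hold for any Base64 string.
    public static bool SameBytesInAllEncodings(string base64)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(base64);
        byte[] ascii = Encoding.ASCII.GetBytes(base64);
        byte[] latin1 = Encoding.GetEncoding("ISO-8859-1").GetBytes(base64);

        return utf8.SequenceEqual(ascii) && utf8.SequenceEqual(latin1);
    }
}
```

Note this only compares the encoded bytes; a byte order mark written by a file writer is a separate matter.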

StaxMan
1
string stringToDecrypt = HttpContext.Current.Request.QueryString.ToString();

// Change the line above to:
// string stringToDecrypt = HttpUtility.UrlDecode(HttpContext.Current.Request.QueryString.ToString());

Uday
0

One gotcha when converting Base64 from a string is that some conversion functions expect the leading "data:image/jpg;base64," prefix, while others only accept the actual data.
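For decoders that only accept the raw data, such as Convert.FromBase64String, the prefix can be stripped first. A minimal sketch, where DataUri and StripPrefix are hypothetical names and the input is assumed to be either bare Base64 or a "data:...;base64," URI:

```csharp
using System;

static class DataUri
{
    // Returns the part after ";base64," when the input starts with "data:";
    // otherwise returns the input unchanged.
    public static string StripPrefix(string input)
    {
        const string marker = ";base64,";

        if (input.StartsWith("data:", StringComparison.OrdinalIgnoreCase))
        {
            int i = input.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
            if (i >= 0) return input.Substring(i + marker.Length);
        }
        return input;
    }
}
```

Usage would be along the lines of Convert.FromBase64String(DataUri.StripPrefix(text)).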

SteveCav
0

If removing \0 from the end of string is impossible, you can add your own character for each string you encode, and remove it on decode.

abatishchev