
I have the following code. When I check the value of the variable i it is 16 (bytes), but when the output is converted to Base64 the result is 24 characters long.

    byte[] bytOut = ms.GetBuffer();
    int i = 0;
    // find the first zero byte and treat it as the end of the data
    for (i = 0; i < bytOut.Length; i++)
        if (bytOut[i] == 0)
            break;

    // convert into Base64 so that the result can be used in xml
    return System.Convert.ToBase64String(bytOut, 0, i);

Is this expected? I am trying to cut down storage and this is one of my problems.

R. Martinho Fernandes
Emily
  • I would expect a 16-byte string to result in 22 base64 characters. – Gabe Jun 15 '11 at 14:44
  • If you're trying to compress text, converting to Base64 is most definitely not the way to go. If you want to compress text, one of the best options is GZip. You will get extremely high levels of compression with little fuss. – Chris Marisic Jun 15 '11 at 14:46
  • Not really trying to compress text. What I did was to encrypt a string, and then as part of the return it converts it to Base64 using the code above. My aim is to keep the returned string as small as possible. I start off with 6 characters, it encrypts to 16 and then Base64 makes it 24. It's getting bigger and bigger :-( – Emily Jun 15 '11 at 14:49
  • @Gabe: Most base64 also pads to a 4-character boundary when the input byte length is not a clean multiple of 3. – Joel B Fant Jun 15 '11 at 14:52

4 Answers


Base64 expresses input made of 8-bit bytes using an alphabet of 64 printable characters, so each output character carries 6 bits of information.

The key to your question is that the encoding works in 24-bit chunks (3 bytes), so every 24 bits or fraction thereof results in 4 characters of output.

16 bytes * 8 bits = 128 bits of information

128 bits / 24 bits per chunk = 5.333 chunks

Rounding up, the final output will be 6 chunks of 4 characters each, or 24 characters.

The fractional chunks are handled with equal signs, which represent the trailing "null bits". In your case, the output will always end in '=='.
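
The arithmetic above can be checked with a small sketch. The class and method names (`Base64Size`, `EncodedLength`) are made up for illustration, and the 16 zero bytes only stand in for the encrypted data; the formula assumes standard padded Base64 as produced by `Convert.ToBase64String`.

```csharp
using System;

static class Base64Size
{
    // Number of characters padded Base64 produces for byteCount input bytes:
    // every group of 3 bytes (or final partial group) becomes 4 characters.
    public static int EncodedLength(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);   // integer division rounds the chunk count up
    }

    static void Main()
    {
        byte[] data = new byte[16];                  // stand-in for 16 encrypted bytes
        string encoded = Convert.ToBase64String(data);

        Console.WriteLine(EncodedLength(16));        // 24
        Console.WriteLine(encoded.Length);           // 24
        Console.WriteLine(encoded.EndsWith("=="));   // True
    }
}
```

With 15 bytes of input the same formula gives 20 characters and no padding, since 15 is a clean multiple of 3.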

Glenn McElhoe

Yes, you'd expect to see some expansion. You're representing your data with an alphabet of only 64 characters, yet every possible byte value, including the unprintable ones, still needs a way to be encoded. So you end up with an expansion of roughly one third: 4 output characters for every 3 input bytes.

Here's a link that explains how much: Base64: What is the worst possible increase in space usage?

Edit: Based on your comment above, if you need to reduce size, you should look at compressing the data before you encrypt. This will get you the max benefit from compression. Compressing encrypted binary does not work, because well-encrypted output looks random and has no redundancy left to remove.
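
To illustrate the compress-before-encrypt point, here is a rough sketch using `GZipStream`. The helper name `GZip` and the class `CompressionDemo` are hypothetical, and seeded random bytes are used only as a stand-in for ciphertext (an assumption, since no actual encryption code is shown here).

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class CompressionDemo
{
    // Compresses a byte array with GZip and returns the compressed bytes.
    public static byte[] GZip(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the final block and trailer.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            return output.ToArray();   // ToArray still works after the stream is closed
        }
    }

    static void Main()
    {
        // Highly redundant plaintext: compresses very well.
        byte[] text = Encoding.UTF8.GetBytes(new string('a', 1024));

        // Random bytes as a stand-in for encrypted data: nothing to compress.
        byte[] random = new byte[1024];
        new Random(42).NextBytes(random);

        Console.WriteLine("text:   1024 -> " + GZip(text).Length);
        Console.WriteLine("random: 1024 -> " + GZip(random).Length);
    }
}
```

Note that the GZip format itself adds at least 18 bytes of header and trailer, so for an input as short as 6 characters compression would make the data larger, not smaller.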

Community
mfanto
  • Okay, that explains it. I want to use the string that I encode as part of a www link. Is converting to Base64 the only way I can do this so that it will work? – Emily Jun 15 '11 at 14:45
  • @Emily: You could just [URI Escape](http://msdn.microsoft.com/en-us/library/system.uri.escape.aspx) the string. That will allow you to put it into a URL – Matt Ellen Jun 15 '11 at 14:51
  • Are you looking for UrlEncoding then? Have a look at http://stackoverflow.com/questions/575440/ – Jack Bolding Jun 15 '11 at 14:52
  • I think the encryption creates non-displayable characters in which case does the UrlEncoding still work? – Emily Jun 15 '11 at 14:56
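
On the URL question raised in these comments, one commonly used option is the URL-safe Base64 variant ("base64url" in RFC 4648), which swaps `+` and `/` for `-` and `_` and drops the `=` padding. The sketch below is only an illustration under that assumption; `UrlSafeBase64`, `Encode` and `Decode` are hypothetical helper names, not an existing framework API.

```csharp
using System;

static class UrlSafeBase64
{
    // Encodes bytes as Base64, then makes the result safe for use in a URL.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data)
            .TrimEnd('=')          // padding can be restored from the length
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverses Encode: restores the standard alphabet and the padding.
    public static byte[] Decode(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }

    static void Main()
    {
        byte[] data = new byte[16];                      // stand-in for 16 encrypted bytes
        string token = Encode(data);

        Console.WriteLine(token.Length);                 // 22
        Console.WriteLine(Decode(token).Length);         // 16
    }
}
```

Dropping the padding brings the 16-byte case down from 24 to 22 characters, and because the input is raw bytes it does not matter whether they are displayable.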

This is because a base64 string can contain only 64 characters (because it has to be displayable), whereas a byte can take any of 256 values, so it can hold more information.

Ehsan Zargar Ershadi
  • I don't know why people marked down your answer. I understood what you are saying and it explained why it needs more space. – Emily Jun 15 '11 at 14:50
  • @Emily: a base64 string can contain more than 64 characters, so this answer starts off wrong. A base64 character can have only 1 of 64 possible values, whereas an (8-bit) byte can have 1 of 256. If this is what Ehsan meant, that's what should be in the answer. (p.s. not a downvoter) – Matt Ellen Jun 15 '11 at 14:55
  • Sorry for my bad English, what I meant was that base64 can be made up of 64 characters. – Ehsan Zargar Ershadi Jun 15 '11 at 15:00

Base64 is a great way to represent binary data in a string using only standard, printable characters. It is not, however, a good way to represent string data because it takes more characters than the original string.

Jonathan Wood