
I am writing an application that receives an encrypted byte array containing a file name and the file's bytes, using the following protocol: file_name_and_extension|bytes. The byte array is then decrypted. I would prefer to pass the result into Encoding.UTF8.GetString(decrypted_bytes), because I want to trim file_name_and_extension from the received bytes and save the actual file bytes into a file named file_name_and_extension.

I simplified my application so that it only receives the file bytes, which are passed into Encoding.UTF8.GetString() and then converted back into a byte array with Encoding.UTF8.GetBytes(). After that, I try to write a zip file, but the file is invalid. It works when using ASCII or Base64.

private void Decryption(byte[] encryptedMessage, byte[] iv)
{
    using (Aes aes = new AesCryptoServiceProvider())
    {
        aes.Key = receiversKey;
        aes.IV = iv;
        // Decrypt the message
        using (MemoryStream decryptedBytes = new MemoryStream())
        {
            using (CryptoStream cs = new CryptoStream(decryptedBytes, aes.CreateDecryptor(), CryptoStreamMode.Write))
            {
                cs.Write(encryptedMessage, 0, encryptedMessage.Length);
                cs.Close();

                string decryptedBytesString = Encoding.UTF8.GetString(decryptedBytes.ToArray()); //corrupts the zip
                //string decryptedBytesString = Encoding.ASCII.GetString(decryptedBytes.ToArray()); //works
                //String decryptedBytesString = Convert.ToBase64String(decryptedBytes.ToArray()); //works

                byte[] fileBytes = Encoding.UTF8.GetBytes(decryptedBytesString);
                //byte[] fileBytes = Encoding.ASCII.GetBytes(decryptedBytesString);
                //byte[] fileBytes = Convert.FromBase64String(decryptedBytesString);
                File.WriteAllBytes("RECEIVED\\received.zip", fileBytes);

            }
        }
    }
}
  • UTF8 works and the other methods don't. Have you compared the original byte array to the final results? ASCII will remove non-printable characters. UTF8 doesn't change characters. You should never take binary data and convert it to strings, so Encoding.UTF8.GetString() shouldn't be used if the data is binary. Anyway, I think the error is not with UTF8. – jdweng Jan 08 '17 at 14:58

1 Answer

Because you shouldn't try to interpret raw bytes as characters in some encoding unless you actually know, or can deduce, the encoding that was used.

If you receive some nonspecific raw bytes, then process them as raw bytes.
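For the file_name_and_extension|bytes protocol from the question, that could mean decoding only the name part and leaving the rest untouched. The following is a minimal sketch, not a definitive implementation: the FramedMessage class and its Split method are hypothetical names, and it assumes the file name is UTF-8 text that never contains '|' (the byte 0x7C cannot appear inside a multi-byte UTF-8 sequence, so searching for it byte-wise is safe under that assumption).

```csharp
using System;
using System.Text;

static class FramedMessage
{
    // Hypothetical helper: splits "file_name|file_bytes" at the first '|' byte.
    // Assumes the name is UTF-8 text and does not itself contain '|'.
    public static (string FileName, byte[] Content) Split(byte[] decrypted)
    {
        int separator = Array.IndexOf(decrypted, (byte)'|');
        if (separator < 0)
            throw new FormatException("Separator '|' not found.");

        // Only the name part is text, so only the name part is decoded.
        string fileName = Encoding.UTF8.GetString(decrypted, 0, separator);

        // Everything after the separator is copied as raw bytes, never decoded.
        byte[] content = new byte[decrypted.Length - separator - 1];
        Buffer.BlockCopy(decrypted, separator + 1, content, 0, content.Length);

        return (fileName, content);
    }
}
```

In the Decryption method this would be called on decryptedBytes.ToArray(), and the returned Content passed straight to File.WriteAllBytes, so the zip data never goes through a string.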

But why does it work with some encodings and not with others?

Because:

  1. Encoding.ASCII is not a safe round trip for binary data either: GetString turns every byte greater than 127 into '?', so the result only matches the input when the data happens to contain no such bytes.
  2. Base64 is designed to represent arbitrary bytes as text, so Convert.ToBase64String followed by Convert.FromBase64String returns exactly the original bytes.
  3. UTF8: zip data is generally not a valid UTF-8 string. With the default Encoding.UTF8 instance, GetString does not throw on invalid input; it replaces each invalid byte sequence with the replacement character U+FFFD, and GetBytes then writes that character back as the three bytes EF BF BD. The output therefore differs from the input. A BOM is not the cause: neither GetString nor GetBytes adds one.
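One way to check how each encoding treats arbitrary bytes is to round-trip a small sample and compare the result with the input. This is only a sketch; RoundTripDemo and its method names are made up for illustration, and the calls used are the standard Encoding and Convert APIs.

```csharp
using System;
using System.Linq;
using System.Text;

static class RoundTripDemo
{
    // True if bytes -> UTF-8 string -> bytes gives back the same bytes.
    public static bool SurvivesUtf8(byte[] data) =>
        Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(data)).SequenceEqual(data);

    // True if bytes -> ASCII string -> bytes gives back the same bytes.
    public static bool SurvivesAscii(byte[] data) =>
        Encoding.ASCII.GetBytes(Encoding.ASCII.GetString(data)).SequenceEqual(data);

    // True if bytes -> Base64 string -> bytes gives back the same bytes.
    public static bool SurvivesBase64(byte[] data) =>
        Convert.FromBase64String(Convert.ToBase64String(data)).SequenceEqual(data);
}
```

For a sample such as { 0x50, 0x4B, 0x03, 0x04, 0xFF, 0x80 } (the zip signature followed by two bytes above 127), only SurvivesBase64 returns true: the UTF-8 round trip replaces 0xFF and 0x80 with EF BF BD each, and the ASCII round trip replaces them with 0x3F ('?').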

In any case, I repeat: do not encode or decode anything unless it is actually string data or the format requires it.

Eugene Podskal