When you binary-serialize an object in .NET using the `BinaryFormatter`, you end up with a byte array, which is obviously meaningless to humans.

Does this byte array correspond to a more meaningful string representation which is human readable? Or do you need to fully deserialize it in order to make it more human readable?

I would expect that the binary formatter has some intermediate string representation of the object which it uses before emitting the byte array. That would be perfect for my needs...

I tried Base64 encoding the byte array, but I just ended up with gibberish.
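
(For reference, the Base64 attempt was roughly the following; `ToBase64` is just an illustrative helper, not part of the framework.)

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

static string ToBase64(object obj)
{
    var formatter = new BinaryFormatter();
    using (var stream = new MemoryStream())
    {
        formatter.Serialize(stream, obj);
        // Base64 re-encodes the same opaque bytes as text; it makes them
        // transportable, not human readable - hence the gibberish.
        return Convert.ToBase64String(stream.ToArray());
    }
}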

EDIT:

As explained in my answer, UTF8 encoding is the best you can get.

The reason I want to do this is so that I can diff two binary serializations, storing only the first serialization and the diff. I was interested in seeing how the serialization works in order to work out how best to diff the byte array.
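
For illustration, the naive byte-level diff I had in mind looks something like this (purely a sketch; it assumes equal-length arrays, and a real implementation would use a proper delta format such as VCDIFF/xdelta that also handles insertions and deletions):

using System.Collections.Generic;

// Record (offset, newValue) for every position where the two
// serializations differ.
static List<(int Offset, byte Value)> Diff(byte[] original, byte[] updated)
{
    var changes = new List<(int, byte)>();
    for (int i = 0; i < original.Length; i++)
    {
        if (original[i] != updated[i])
            changes.Add((i, updated[i]));
    }
    return changes;
}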

Yair Halberstadt
  • make it a string using the following: byte[] data = null; string bytes = string.Join(" ", data.Select(x => x.ToString("x2"))); – jdweng Dec 19 '18 at 15:43
  • @jdweng That also results in gibberish I'm afraid... – Yair Halberstadt Dec 19 '18 at 15:45
  • See https://stackoverflow.com/a/30176566/292411 for an example output of `BinaryFormatter`. Perhaps for programmers it is not _complete_ gibberish, but the wife (or husband, depending on who's who eh!) wouldn't even try reading that. – C.Evenhuis Dec 19 '18 at 15:48
  • You are reading integers from bytes. Are the numbers supposed to be in a particular range? Binary data is usually a combination of different-size objects, and you have to read the objects based on the expected size. You have to find the specification of the binary data and read according to the spec. It is possible to read sections of a binary file once you know the structure. Binary images usually have an ASCII header at the beginning which gives the file name, the type of image (like JPEG) and the size of the image. You can open an image with Notepad and see the ASCII header. – jdweng Dec 19 '18 at 15:53
  • Binary format is not meant to be read, rather stored/transported/deserialized. There is surely nothing string-like between an object and its binary-serialized form. Base64 is used only to transport bytes as text, hexadecimal to check something if you know the exact format of the data. Sometimes you can simply try to read binary as text (ignoring errors) if it contains ASCII to see those parts. Could you tell us what you actually want to do? – Sinatr Dec 19 '18 at 16:04

1 Answer

*How to analyse contents of binary serialization stream?* discusses the format of binary serialization in greater detail, and also has a link to an analyzer of sorts.

There's no fully human-readable intermediate representation, but `Console.WriteLine(System.Text.Encoding.UTF8.GetString(bytes));` will print something that might be workable, depending on the exact purpose for which it's needed.

Note that only some bytes can be decoded using UTF8, as only parts of the byte array are UTF8-encoded. There'll be plenty of replacement characters in the resulting string where bytes don't map to valid text.

As an example, serialising the following and converting the result to a UTF8 string:

namespace MyNamespace
{
    [Serializable]
    public class Class
    {
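        // BinaryFormatter serializes private fields directly, including the
        // compiler-generated backing field of the auto-property below
        // (it shows up as "<String>k__BackingField" in the output).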
        private readonly int _int = 42;

        public string String { get; } = "MyString";
    }
}

results in:

"    ????          ConsoleApp, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null   MyNamespace.Class   _int<String>k__BackingField   *    MyString"

Which isn't completely useless...
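
For completeness, a minimal sketch of the round trip that produces output like the above (the program around it is illustrative; `Class` is the type from the snippet):

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;

// Serialize an instance, then decode the raw stream bytes as UTF8.
// Only the embedded strings (assembly name, type name, member names,
// string values) come out readable; the structural bytes in between
// don't map to valid characters.
var formatter = new BinaryFormatter();
using (var stream = new MemoryStream())
{
    formatter.Serialize(stream, new MyNamespace.Class());
    byte[] bytes = stream.ToArray();
    Console.WriteLine(Encoding.UTF8.GetString(bytes));
}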

Yair Halberstadt