
I have a peculiar situation. A legacy system that is no longer in use stored Base64 values that we now need to read.

By converting a Base64 value to a string I can see that it contains the properties I need.

The problem is that I can't deserialize either the byte array or the string to an anonymous type, object or dynamic, because I don't have access to the binaries that define the serialized type. In this example that assembly shows up as ConsoleApp2.

First try:

public static object FromByteArray(byte[] data)
{
    BinaryFormatter bf = new BinaryFormatter();
    using (MemoryStream ms = new MemoryStream(data))
    {
        object obj = bf.Deserialize(ms);
        return obj;
    }
}

Source:

https://stackoverflow.com/a/33022788/3850405

System.Runtime.Serialization.SerializationException: 'Unable to find assembly 'ConsoleApp2, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'.'

Since properties can normally be read from a value typed as plain object, I tried to bind every type to System.Object with a SerializationBinder.

object o = new { A = "1", B = 2 };


public static object FromByteArray(byte[] data)
{
    BinaryFormatter bf = new BinaryFormatter();
    using (MemoryStream ms = new MemoryStream(data))
    {
        bf.Binder = new PreMergeToMergedDeserializationBinder();
        object obj = bf.Deserialize(ms);
        return obj;
    }
}

sealed class PreMergeToMergedDeserializationBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        var systemObjectAssembly = "System.Object, System.Runtime, Version=4.2.2.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a";

        return Type.GetType(systemObjectAssembly);
    }
}

Source:

https://stackoverflow.com/a/9012089/3850405

This avoids the runtime error, but the result is an empty object with none of the serialized values.

Listing the properties with the code below returns nothing either, since System.Object declares no properties.

Type myType = myObject.GetType();
List<PropertyInfo> props = new List<PropertyInfo>(myType.GetProperties());

What can I do to deserialize this Base64 string and access the properties? I would prefer not to recreate the complete class hierarchy, since the original object is quite large.

Runnable example program:

using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            object o = new { A = "1", B = 2 };

            var base64String = "AAEAAAD/////AQAAAAAAAAAMAgAAAEJDb25zb2xlQXBwMiwgVmVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPW51bGwFAQAAABhDb25zb2xlQXBwMi5FeGFtcGxlTW9kZWwCAAAAHDxNeVByb3BlcnR5QT5rX19CYWNraW5nRmllbGQcPE15UHJvcGVydHlCPmtfX0JhY2tpbmdGaWVsZAEACAIAAAAGAwAAAAtNeVRlc3RWYWx1ZXsAAAAL";

            byte[] byteArray = Convert.FromBase64String(base64String);
            string objectInfo = System.Text.Encoding.UTF8.GetString(byteArray);

            var myObject = FromByteArray(byteArray);

            Type myType = myObject.GetType();
            List<PropertyInfo> props = new List<PropertyInfo>(myType.GetProperties());
        }

        public static object FromByteArray(byte[] data)
        {
            BinaryFormatter bf = new BinaryFormatter();
            using (MemoryStream ms = new MemoryStream(data))
            {
                bf.Binder = new PreMergeToMergedDeserializationBinder();
                object obj = bf.Deserialize(ms);
                return obj;
            }
        }
    }

    sealed class PreMergeToMergedDeserializationBinder : SerializationBinder
    {
        public override Type BindToType(string assemblyName, string typeName)
        {
            var systemObjectAssembly = "System.Object, System.Runtime, Version=4.2.2.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a";

            return Type.GetType(systemObjectAssembly);
        }
    }
}

My guess at how the original object was serialized in the first place:

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

namespace ConsoleApp2
{
    [Serializable]
    public class ExampleModel
    {
        public string MyPropertyA { get; set; }

        public int MyPropertyB { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var t = new ExampleModel();
            t.MyPropertyA = "MyTestValue";
            t.MyPropertyB = 123;

            var byteArray = ToByteArray<ExampleModel>(t);

            var base64String = Convert.ToBase64String(byteArray);
        }

        public static byte[] ToByteArray<T>(T obj)
        {
            if (obj == null)
                return null;
            BinaryFormatter bf = new BinaryFormatter();
            using (MemoryStream ms = new MemoryStream())
            {
                bf.Serialize(ms, obj);
                return ms.ToArray();
            }
        }
    }
}
  • One way would be to [parse it manually](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-nrbf/75b9fe09-be15-475f-85b8-ae7b7558cfe5?redirectedfrom=MSDN), perhaps by also referencing the [reference source](https://referencesource.microsoft.com/#mscorlib/system/runtime/serialization/formatters/binary/binaryformatter.cs). It's a painful path though. – vgru Oct 06 '20 at 10:33
  • @Groo Thanks, I did a variation of that but if possible I would still prefer a more generic approach and not have to map every property. – Ogglas Oct 06 '20 at 12:36
  • This is what happens when someone uses technology inappropriately. Binary serialization isn't meant for storage, it is meant for transport between different parts of your process(es). Long term storage creates maintenance problems because changes to your software or to the .NET framework might make the data hard to deserialize. As such, there are no good ready-made solutions to this problem because those problems weren't supposed to exist. – Lasse V. Karlsen Oct 07 '20 at 09:11
  • @LasseV.Karlsen Completely agree but sometimes you have to play the hand you're dealt. Unless time travel comes around anytime soon. :) – Ogglas Oct 07 '20 at 09:17
  • If time travel becomes possible in the future, don't you think us developers would be among the first *that would already know*? I mean, who hasn't had to maintain code that is just "Oh, how I wish I knew what I now know back then when I was a version 0.9 developer". First thing I would do would be to travel back in time to slap myself hard. Several times, and then I would queue up to slap the developer of Javascript. Probably be a long queue though but then time would no longer be a problem would it :) – Lasse V. Karlsen Oct 07 '20 at 09:19
  • @LasseV.Karlsen Haha so true! I almost feel bad for Brendan Eich in September 1995 if time travel becomes available... – Ogglas Oct 07 '20 at 09:35

1 Answer

This is far from ideal, but it works if you only need a small part of the data.

I started by looking at the string generated:

\0\u0001\0\0\0����\u0001\0\0\0\0\0\0\0\f\u0002\0\0\0BConsoleApp2, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null\u0005\u0001\0\0\0\u0018ConsoleApp2.ExampleModel\u0002\0\0\0\u001c<MyPropertyA>k__BackingField\u001c<MyPropertyB>k__BackingField\u0001\0\b\u0002\0\0\0\u0006\u0003\0\0\0\vMyTestValue{\0\0\0\v

From there I could see the type ConsoleApp2.ExampleModel with the members MyPropertyA and MyPropertyB. The member types could be worked out from the MemberTypeInfo bytes in a hex dump, but to save time I declared every property as object.

If you want to know more about what the binary format of serialized .NET objects looks like and how it can be interpreted correctly, I recommend this thread:

https://stackoverflow.com/a/30176566/3850405
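To make the record layout concrete, here is a minimal sketch of reading the example payload by hand with BinaryReader. It is not a general parser: it assumes exactly the record sequence seen in the string above (header, one BinaryLibrary, one ClassWithMembersAndTypes whose members are only String or Int32) and throws on anything else. The names NrbfPeek and ReadFlatObject are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Sketch only: handles the flat String/Int32 layout of the example payload.
public static class NrbfPeek
{
    public static Dictionary<string, object> ReadFlatObject(byte[] data, out string className)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            // SerializationHeaderRecord: type 0, then four Int32 values
            // (RootId, HeaderId, MajorVersion, MinorVersion).
            Expect(reader.ReadByte(), 0);
            reader.ReadBytes(16);

            // BinaryLibrary: type 12, LibraryId, length-prefixed assembly name.
            Expect(reader.ReadByte(), 12);
            reader.ReadInt32();
            reader.ReadString();

            // ClassWithMembersAndTypes: type 5, ObjectId, class name, member names.
            Expect(reader.ReadByte(), 5);
            reader.ReadInt32();
            className = reader.ReadString();
            int memberCount = reader.ReadInt32();
            var names = new string[memberCount];
            for (int i = 0; i < memberCount; i++)
                names[i] = reader.ReadString();

            // MemberTypeInfo: one BinaryType byte per member (0 = Primitive, 1 = String),
            // followed by one PrimitiveType byte for each Primitive member.
            byte[] binaryTypes = reader.ReadBytes(memberCount);
            var primitiveTypes = new byte[memberCount];
            for (int i = 0; i < memberCount; i++)
            {
                if (binaryTypes[i] == 0)
                    primitiveTypes[i] = reader.ReadByte();
                else if (binaryTypes[i] != 1)
                    throw new NotSupportedException("Unsupported BinaryType " + binaryTypes[i]);
            }
            reader.ReadInt32(); // LibraryId the class belongs to

            // Member values follow in declaration order.
            var values = new Dictionary<string, object>();
            for (int i = 0; i < memberCount; i++)
            {
                if (binaryTypes[i] == 1)
                {
                    // BinaryObjectString: type 6, ObjectId, length-prefixed value.
                    Expect(reader.ReadByte(), 6);
                    reader.ReadInt32();
                    values[names[i]] = reader.ReadString();
                }
                else if (primitiveTypes[i] == 8) // PrimitiveType 8 = Int32
                {
                    values[names[i]] = reader.ReadInt32();
                }
                else
                {
                    throw new NotSupportedException("Unsupported PrimitiveType " + primitiveTypes[i]);
                }
            }
            return values;
        }
    }

    private static void Expect(byte actual, byte expected)
    {
        if (actual != expected)
            throw new InvalidDataException("Expected record type " + expected + " but found " + actual);
    }
}
```

Calling NrbfPeek.ReadFlatObject(Convert.FromBase64String(base64String), out className) on the payload from the question yields the class name and a dictionary keyed by the backing field names. Null strings, references, nested classes and arrays use other record types that this sketch deliberately does not cover.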

I then created a namespace and a class with the properties for ConsoleApp2.ExampleModel.

namespace ConsoleApp2
{
    [Serializable]
    public class ExampleModel
    {
        public object MyPropertyA { get; set; }

        public object MyPropertyB { get; set; }
    }
}

After that I used the dynamic SerializationBinder from the source below:

sealed class PreMergeToMergedDeserializationBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        Type typeToDeserialize = null;

        // For each assemblyName/typeName that you want to deserialize to
        // a different type, set typeToDeserialize to the desired type.
        String exeAssembly = Assembly.GetExecutingAssembly().FullName;


        // The following line of code returns the type.
        typeToDeserialize = Type.GetType(String.Format("{0}, {1}",
            typeName, exeAssembly));

        return typeToDeserialize;
    }
}

https://stackoverflow.com/a/9012089/3850405
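As a variation on the binder above, the lookup can be driven by an explicit table of type names instead of probing the executing assembly. This is only a sketch; MappedTypeBinder is a hypothetical name, and it relies on the formatter falling back to its default type lookup when BindToType returns null.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

// Hypothetical binder: maps serialized type names to local stand-in types.
public sealed class MappedTypeBinder : SerializationBinder
{
    private readonly Dictionary<string, Type> _map;

    public MappedTypeBinder(Dictionary<string, Type> map)
    {
        _map = map;
    }

    public override Type BindToType(string assemblyName, string typeName)
    {
        Type mapped;
        if (_map.TryGetValue(typeName, out mapped))
            return mapped;

        // Not in the table: return null so the formatter resolves the type itself
        // (this is what happens for framework types such as System.String).
        return null;
    }
}
```

It would be assigned the same way as the binder above, for example with a table containing the single entry "ConsoleApp2.ExampleModel" mapped to typeof(ExampleModel). The explicit table makes it obvious which legacy types are being redirected.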

After doing that I could get the values I needed.

In the real example I only declared the properties that I actually needed and ignored the rest.

Complete example:

using ConsoleApp2;
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

namespace ConsoleApp2
{
    [Serializable]
    public class ExampleModel
    {
        public object MyPropertyA { get; set; }

        public object MyPropertyB { get; set; }
    }
}

namespace ConsoleApp1
{
    class Program
    {

        static void Main(string[] args)
        {
            object o = new { A = "1", B = 2 };

            var base64String = "AAEAAAD/////AQAAAAAAAAAMAgAAAEJDb25zb2xlQXBwMiwgVmVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPW51bGwFAQAAABhDb25zb2xlQXBwMi5FeGFtcGxlTW9kZWwCAAAAHDxNeVByb3BlcnR5QT5rX19CYWNraW5nRmllbGQcPE15UHJvcGVydHlCPmtfX0JhY2tpbmdGaWVsZAEACAIAAAAGAwAAAAtNeVRlc3RWYWx1ZXsAAAAL";

            byte[] byteArray = Convert.FromBase64String(base64String);
            string objectInfo = System.Text.Encoding.UTF8.GetString(byteArray);

            var myObject = FromByteArray<ExampleModel>(byteArray);

            Type myType = myObject.GetType();
            List<PropertyInfo> props = new List<PropertyInfo>(myType.GetProperties());
        }

        public static T FromByteArray<T>(byte[] data)
        {
            if (data == null)
                return default(T);
            BinaryFormatter bf = new BinaryFormatter();
            using (MemoryStream ms = new MemoryStream(data))
            {
                bf.Binder = new PreMergeToMergedDeserializationBinder();
                object obj = bf.Deserialize(ms);
                return (T)obj;
            }
        }

    }

    sealed class PreMergeToMergedDeserializationBinder : SerializationBinder
    {
        public override Type BindToType(string assemblyName, string typeName)
        {
            Type typeToDeserialize = null;

            // For each assemblyName/typeName that you want to deserialize to
            // a different type, set typeToDeserialize to the desired type.
            String exeAssembly = Assembly.GetExecutingAssembly().FullName;


            // The following line of code returns the type.
            typeToDeserialize = Type.GetType(String.Format("{0}, {1}",
                typeName, exeAssembly));

            return typeToDeserialize;
        }
    }
}

A few lessons learnt:

A member name like ConsoleApp2.ExampleModel_someItems without k__BackingField is probably a plain field rather than an auto-property with get and set accessors, declared like this:

public object _someItems;
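The naming difference can be checked with reflection. The sketch below uses a made-up class, FieldNamingDemo, with one auto-property and one plain field, and lists the instance field names that a field-based serializer such as BinaryFormatter would see. The exact backing field name is a compiler convention, not a documented guarantee.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class FieldNamingDemo
{
    // Auto-property: the compiler generates a hidden backing field for it.
    public string MyPropertyA { get; set; }

    // Plain public field: serialized under its own name.
    public object _someItems;

    // Returns the names of all instance fields, sorted for a stable order.
    public static string[] InstanceFieldNames()
    {
        return typeof(FieldNamingDemo)
            .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .Select(f => f.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
    }
}
```

With the current C# compilers the result contains <MyPropertyA>k__BackingField and _someItems, matching the two kinds of names found in the serialized data.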

A value like System.Collections.Generic.List`1[[ConsoleApp2.ExampleModelListItem, ConsoleApp2, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]] needs to be handled individually in SerializationBinder. I did it like this:

if (typeName.Contains("System.Collections.Generic.List") && typeName.Contains("ExampleModelListItem"))
{
    var t = new List<ExampleModelListItem>();
    typeToDeserialize = t.GetType();
}

If you try to do something like this:

if (typeName.Contains("System.Collections.Generic.List"))
{
    var t = new List<object>();
    typeToDeserialize = t.GetType();
}

it results in an exception similar to:

'Object of type 'System.Collections.Generic.List`1[System.Object]' cannot be converted to type 'System.Collections.Generic.List`1[ConsoleApp2.ExampleModelListItem]'.'
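If there are many list types, the per-type check can be generalized by extracting the element type name and closing List<> over the matching local type. The following is a sketch under assumptions: GenericListBinding and TryResolveList are hypothetical names, the resolveElement callback stands in for whatever lookup the binder already uses, and only a single, non-generic element type is handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GenericListBinding
{
    // Matches names such as:
    // System.Collections.Generic.List`1[[Some.Element.Type, SomeAssembly, Version=...]]
    private static readonly Regex ListPattern = new Regex(
        @"^System\.Collections\.Generic\.List`1\[\[(?<element>[^,\[\]]+),");

    // Returns List<T> for the resolved element type, or null when the name is not
    // a simple List`1 or the element type cannot be resolved.
    public static Type TryResolveList(string typeName, Func<string, Type> resolveElement)
    {
        Match match = ListPattern.Match(typeName);
        if (!match.Success)
            return null;

        Type elementType = resolveElement(match.Groups["element"].Value.Trim());
        if (elementType == null)
            return null;

        return typeof(List<>).MakeGenericType(elementType);
    }
}
```

Inside BindToType this could be tried first, passing a callback that maps "ConsoleApp2.ExampleModelListItem" to the local ExampleModelListItem class. Because the closed list type uses the real element type, it avoids the List<object> conversion error shown above.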

A name like ConsoleApp2.ExampleModel+EnumTypes means a nested class or enum; the + separates the outer type from the type declared inside it.

Solved like this:

namespace ConsoleApp2
{
    [Serializable]
    public class ExampleModel
    {
        public object MyPropertyA { get; set; }

        public enum EnumTypes
        {
            a
        }
    }
}

Source:

https://stackoverflow.com/a/2443261/3850405
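The + notation is simply how reflection writes the full name of a nested type, which can be confirmed with a small sketch. The namespace and type names below (NestedDemo, Outer, Inner) are invented for the illustration.

```csharp
using System;

namespace NestedDemo
{
    public class Outer
    {
        public enum EnumTypes { a }

        public class Inner { }
    }

    public static class NestedNames
    {
        // Full names of the nested enum and the nested class.
        public static string EnumName()
        {
            return typeof(Outer.EnumTypes).FullName;
        }

        public static string ClassName()
        {
            return typeof(Outer.Inner).FullName;
        }
    }
}
```

EnumName() returns NestedDemo.Outer+EnumTypes and ClassName() returns NestedDemo.Outer+Inner, so a stand-in for a serialized type named with + has to be declared inside a class with the outer type's name.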
