List<T> has a private int field called _version that is incremented every time the list is modified. Unfortunately for me, this field is also written out during binary serialization, so two lists with identical content can produce different byte arrays.

What would be the easiest way to make the serialized byte arrays identical? Writing a SerializationSurrogate? Finding the field in the serialized byte array and setting it to zero? Manually traversing my object graph and setting _version to zero using reflection? Are there any serialization attributes I can use? Or should I use a different collection class, and if so, which one?

https://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs

EDIT: Added code to clarify:

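using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization.Formatters.Binary;

// Two lists that end up with the same content but a different modification history
List<string> l1 = new List<string>();
List<string> l2 = new List<string>();
l1.Add("Hi");
l2.Add("Hi");
l2.Clear();
l2.Add("Hi");

byte[] b1 = Serialize(l1);
byte[] b2 = Serialize(l2);   // Contents of b2 will not be the same as b1, because _version differs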

public byte[] Serialize(object o)
{
  using (MemoryStream stream = new MemoryStream())
  {
    BinaryFormatter formatter = new BinaryFormatter();
    formatter.Serialize(stream, o);
    return stream.ToArray();
  }
}

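// One UGLY way that I'm not happy with: reset _version via reflection before serializing
private void ClearListVersion(object list)
{
  if (list == null) return;
  FieldInfo fieldInfo = list.GetType().GetField("_version", BindingFlags.NonPublic | BindingFlags.Instance);
  // GetField returns null if the object has no _version field (e.g. it is not a List<T>)
  if (fieldInfo != null) fieldInfo.SetValue(list, 0);
}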
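
For comparison with the reflection helper above, here is a minimal sketch of one alternative for a single, flat list: serialize a fresh copy instead of the original. It assumes the .NET Framework List<T> implementation from the reference source linked above, where the copy constructor leaves _version at its initial value and sizes the backing array to the item count. FreshCopy is a hypothetical helper name, not part of any library, and the sketch does not handle lists nested deeper in an object graph.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.Serialization.Formatters.Binary;

static class VersionNormalizationSketch
{
    // Same serialization routine as in the question.
    static byte[] Serialize(object o)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            BinaryFormatter formatter = new BinaryFormatter();
            formatter.Serialize(stream, o);
            return stream.ToArray();
        }
    }

    // Hypothetical helper: copies the items into a newly constructed list.
    // Assumption: a newly constructed List<T> has not been modified yet,
    // so its _version and capacity depend only on the items it was built from.
    static List<T> FreshCopy<T>(List<T> source)
    {
        return source == null ? null : new List<T>(source);
    }

    static void Main()
    {
        List<string> l1 = new List<string>();
        List<string> l2 = new List<string>();
        l1.Add("Hi");
        l2.Add("Hi");
        l2.Clear();
        l2.Add("Hi");

        // Serialize the copies rather than the original lists.
        byte[] b1 = Serialize(FreshCopy(l1));
        byte[] b2 = Serialize(FreshCopy(l2));

        // Byte-wise comparison of the two serialized copies.
        Console.WriteLine(b1.SequenceEqual(b2));
    }
}
```

The copy costs one extra allocation per list, and every list in a larger object graph would still have to be found and replaced, so this is only a fit when the lists to be compared are directly accessible.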

Björn Morén
  • I don't understand. When the field is also serialized, the contents should be equal, but you're saying that they're not? Are you trying to compare the _serialized_ data sets? – PMF Dec 14 '19 at 15:55
  • A [serialization surrogate](https://stackoverflow.com/q/13166105/3744182) looks to be the most reasonable. Convert the list to an array inside the surrogate, and serialize that. You're going to have similar problems with dictionary, hashtable, and any other .Net built-in collection though, so if your requirement is to "diff two object graphs" this approach seems fragile. – dbc Dec 14 '19 at 16:54
  • PMF: Correct, I am comparing the serialized byte arrays. Let's say you have two List objects, and they contain the exact same strings. When you serialize them to byte arrays, there is no guarantee those byte arrays will be identical. This is because of the _version field. – Björn Morén Dec 14 '19 at 17:25
  • Can you show us the actual code and type you are trying to serialize? – Jonathan Alfaro Dec 14 '19 at 17:40
  • Darkonekt: I have edited the question now. – Björn Morén Dec 14 '19 at 18:20
  • Then [C# implementation of deep/recursive object comparison in .net 3.5](https://stackoverflow.com/q/1539989/3744182), [How to compare two .NET object graphs for differences?](https://stackoverflow.com/q/6661259/3744182), [Best way to compare two complex objects](https://stackoverflow.com/q/10454519/3744182) and [Compare the content of two objects for equality](https://stackoverflow.com/q/375996/3744182) might help. – dbc Dec 14 '19 at 20:02

0 Answers