I have found myself cornered, so here we go.
Context
I need to produce a fingerprint hash code for object diffing. Comparing the hashes of two sets of objects must tell me whether the sets contain identical objects, i.e. objects with the same hash.
The fingerprint hash must be platform-independent. So I went for MD5 hashing.
I am working with a large object model code base that is out of my control. I cannot modify any of the types I will be passed for fingerprinting: I cannot add attributes or constructors, or change anything else. The types may also change in the future, so any approach must be programmatic -- I cannot just create a Surrogate class to avoid the problem; at least, not manually.
However, performance is not a concern, so reflection has the green light.
In addition, I need to be able to exclude properties from the hashing. If I exclude a certain property, two objects whose properties are all identical except for that one must still get the same hash.
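To make the exclusion requirement concrete, here is a minimal sketch of the behaviour I am after. It only handles flat objects and is not the approach I describe below. The names Beam, ExclusionDemo and Fingerprint are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical type, standing in for a legacy class I cannot modify.
public class Beam
{
    public string Name { get; set; }
    public double Length { get; set; }
    public Guid Id { get; set; }
}

public static class ExclusionDemo
{
    // Hashes the public, readable, non-indexed properties of a flat object,
    // skipping any property whose name is in excludedNames.
    public static string Fingerprint(object obj, ISet<string> excludedNames)
    {
        var parts = obj.GetType()
            .GetProperties()
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .Where(p => !excludedNames.Contains(p.Name))
            .OrderBy(p => p.Name, StringComparer.Ordinal) // deterministic order
            .Select(p => p.Name + "=" +
                Convert.ToString(p.GetValue(obj), CultureInfo.InvariantCulture));

        byte[] bytes = Encoding.UTF8.GetBytes(string.Join(";", parts));
        using (MD5 md5 = MD5.Create())
        {
            return BitConverter.ToString(md5.ComputeHash(bytes))
                .Replace("-", "")
                .ToLowerInvariant();
        }
    }

    public static void Main()
    {
        var a = new Beam { Name = "B1", Length = 3.5, Id = Guid.NewGuid() };
        var b = new Beam { Name = "B1", Length = 3.5, Id = Guid.NewGuid() };
        var excluded = new HashSet<string> { "Id" };

        // The two objects differ only in Id, so with Id excluded the hashes match.
        Console.WriteLine(Fingerprint(a, excluded) == Fingerprint(b, excluded));
    }
}
```

With Id excluded, a and b get the same fingerprint. With an empty exclusion set they would not, because their Id values differ.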
Issue: serializing to Byte[] with hands tied on the legacy code
MD5 hashing requires the object to be serialized to a Byte[].
The standard .NET binary serialization requires the class to be marked as [Serializable], which I cannot add to the legacy code, and naturally it cannot be added at runtime either.
So I went for protobuf-net.
protobuf-net rightly fails when it encounters types that implement an interface with getter-only auto-properties:
public interface ISomeInterface
{
    double Vpy { get; }
    double Vy { get; }
    double Vpz { get; }
    ...
}
Since this interface is implemented by many types, using surrogates also seems a no-go (impractical and unmaintainable).
I only need to serialize, not deserialize, so I don't see why this limitation of protobuf-net should apply in my case. I understand that protobuf-net would not be able to round-trip such types, but I don't need to round-trip!
Question
Am I really cornered? Is there any alternative?
My code
As I said, this works perfectly, but only if the objects do not have any property (or nested property) whose type has a getter-only auto-property.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;

// Placeholder class name: extension methods must live in a static class.
public static class FingerprintExtensions
{
    public static byte[] ToByteArray(this object obj, List<PropertyInfo> exclusionsProps = null)
    {
        if (exclusionsProps == null)
            exclusionsProps = new List<PropertyInfo>();

        // Protobuf-net implementation
        ProtoBuf.Meta.RuntimeTypeModel model = ProtoBuf.Meta.TypeModel.Create();
        AddPropsToModel(model, obj.GetType(), exclusionsProps);

        using (var memoryStream = new MemoryStream())
        {
            model.Serialize(memoryStream, obj);
            // ToArray() returns only the bytes written; GetBuffer() would also
            // include the unused tail of the internal buffer.
            return memoryStream.ToArray();
        }
    }

    public static void AddPropsToModel(ProtoBuf.Meta.RuntimeTypeModel model, Type objType, List<PropertyInfo> exclusionsProps = null)
    {
        // Start from all public properties of the type, then drop the excluded ones.
        List<PropertyInfo> props = new List<PropertyInfo>(objType.GetProperties());
        if (exclusionsProps != null)
            props.RemoveAll(pr => exclusionsProps.Exists(t => t.DeclaringType == pr.DeclaringType && t.Name == pr.Name));

        // Register the types of nested class/interface properties first.
        props
            .Where(prop => prop.PropertyType.IsClass || prop.PropertyType.IsInterface).ToList()
            .ForEach(prop => AddPropsToModel(model, prop.PropertyType, exclusionsProps)); // recursive call

        // Sort the member names so that the field order is deterministic.
        var propsNames = props.Select(p => p.Name).OrderBy(name => name).ToList();
        model.Add(objType, true).Add(propsNames.ToArray());
    }
}
I then use it as follows:
foreach (var obj in objs)
{
    byte[] objByte = obj.ToByteArray(exclusionTypes);
    using (MD5 md5Hash = MD5.Create())
    {
        string hash = GetMd5Hash(md5Hash, objByte);
        Console.WriteLine(obj.GetType().Name + ": " + hash);
    }
}
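GetMd5Hash is not shown above. For completeness, a minimal sketch of such a helper could look like the following, assuming it takes the MD5 instance and the byte array and returns the digest as a hex string. The class name HashHelpers and the lowercase hex format are assumptions made for this example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashHelpers
{
    // Returns the MD5 digest of the input bytes as a lowercase hex string.
    public static string GetMd5Hash(MD5 md5Hash, byte[] input)
    {
        byte[] digest = md5Hash.ComputeHash(input);

        var sb = new StringBuilder(digest.Length * 2);
        foreach (byte b in digest)
            sb.Append(b.ToString("x2")); // two hex characters per byte

        return sb.ToString();
    }

    public static void Main()
    {
        using (MD5 md5 = MD5.Create())
        {
            // Prints the 32-character hex digest of the UTF-8 bytes of "hello".
            Console.WriteLine(GetMd5Hash(md5, Encoding.UTF8.GetBytes("hello")));
        }
    }
}
```

Because the digest depends only on the bytes passed in, the platform independence of the fingerprint rests entirely on the serialization step producing the same bytes everywhere.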