In JSON.NET how to get a reference to every deserialized object?

Question

I'm attempting to implement support for IDeserializationCallback using JSON.NET. I'm deserializing an object, and I would like to generate a list of all the deserialized objects that implement IDeserializationCallback. What would be the best way to do this? Does JSON.NET have an appropriate extension point for this? I have a (seemingly) working solution below, but it is quite ugly, so I'm convinced there must be a better way. Any help is appreciated, thanks!

    private static JsonSerializer serializer = new JsonSerializer();

    // Static constructor of the enclosing class ("cctor" stands in for the class name).
    static cctor()
    {
        serializer.Converters.Add(new DeserializationCallbackConverter());
    }

    public static T Deserialize<T>(byte[] data)
    {
        using (var reader = new JsonTextReader(new StreamReader(new MemoryStream(data))))
        using (DeserializationCallbackConverter.NewDeserializationCallbackBlock(reader))
            return serializer.Deserialize<T>(reader);
    }

    private class DeserializationCallbackConverter : JsonConverter
    {
        [ThreadStatic]
        private static ScopedConverter currentConverter;

        public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
        {
            throw new NotImplementedException();
        }

        public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
        {
            return currentConverter.ReadJson(reader, objectType, serializer);
        }

        public override bool CanConvert(Type objectType)
        {
            return currentConverter != null && currentConverter.CanConvert();
        }

        public override bool CanWrite
        {
            get { return false; }
        }

        public static IDisposable NewDeserializationCallbackBlock(JsonReader reader)
        {
            return new ScopedConverter(reader);
        }

        private class ScopedConverter : IDisposable
        {
            private JsonReader jsonReader;
            private string currentPath;
            private List<IDeserializationCallback> callbackObjects;

            public ScopedConverter(JsonReader reader)
            {
                jsonReader = reader;
                callbackObjects = new List<IDeserializationCallback>();
                currentConverter = this;
            }

            public object ReadJson(JsonReader reader, Type objectType, JsonSerializer serializer)
            {
                var lastPath = currentPath;
                currentPath = reader.Path;
                var obj = serializer.Deserialize(reader, objectType);
                currentPath = lastPath;

                var dc = obj as IDeserializationCallback;
                if (dc != null && callbackObjects != null)
                    callbackObjects.Add(dc);
                return obj;
            }

            public bool CanConvert()
            {
                return jsonReader.Path != currentPath;
            }

            public void Dispose()
            {
                currentConverter = null;
                foreach (var obj in callbackObjects)
                    obj.OnDeserialization(null);
            }
        }
    }

I think you want to [traverse](https://en.wikipedia.org/wiki/Graph_traversal) the graph and fill a list, so that each node in the graph is also referenced by an item in the list, right? — Geeky Guy, Sep 02 '16 at 20:11
Have you tried the reflection-based approach? In my experience it is not very slow. — Casey, Sep 02 '16 at 20:12
Casey: yes I have tried it, and it is slower (and far more complex) than I would like. It is a last resort option. The reflection approach also has the downside that I can't tell if a given object reference was actually just deserialized - it could be a reference to an existing (or freshly instantiated) object populated via ctor or [OnDeserialized] or something similar. — redec, Sep 02 '16 at 20:14
Then please see [this article](https://en.wikipedia.org/wiki/Breadth-first_search). You have to visit each object in the graph once, and then keep a reference to it in a list. This is one of the most efficient ways to do it. Please do not use reflection for a task such as this: not only is it not very efficient, it may also open the gate for bad practices which will come back to haunt you later. — Geeky Guy, Sep 02 '16 at 20:15
How would I get this graph if not by reflection though? When I say "graph", it is not some formalized graph data structure I have - it's a random C# object that has internal references to other objects (and they in turn potentially have references to more objects, etc.). — redec, Sep 02 '16 at 20:22
So, this question is far too broad and too undefined for Stack Overflow. Please do some research and if you have specific questions about some code, then ask here. If you can narrow down your question somewhat (but perhaps not down to the code level), it *might* be on topic over at [programmers.se], but please do read their help center to be sure. Also take a look at [this answer on a question regarding how Programmers differs from SO](http://meta.programmers.stackexchange.com/a/7183/222246) — Heretic Monkey, Sep 02 '16 at 20:27
Any code/data to show your problem, so that we can work on it to speed up? — L.B, Sep 02 '16 at 20:33
I understand what you're saying, and I think maybe I'm not doing a good job explaining what I'm trying to do, because to me it isn't broad at all - very simply, I'm deserializing an object using JSON.NET, and I need to find out what objects JSON.NET deserialized as part of that. I'm wondering if JSON.NET has any extension point or any facility which could help me with that. — redec, Sep 02 '16 at 20:35
@redec - if you just want a list of the *reference type* objects that Json.NET created, then that's straightforward. But for complex value types it will be very difficult to collect those that were retained in the graph as boxed structs (e.g. as in [this question](https://stackoverflow.com/questions/15207260)) vs those that were embedded in some larger object via a property setter. Can you make your question more specific? — dbc, Sep 02 '16 at 21:08
Also, is your graph serialized with [`PreserveReferencesHandling`](http://www.newtonsoft.com/json/help/html/PreserveReferencesHandlingObject.htm)? — dbc, Sep 02 '16 at 21:13
Ok, my apologies guys, I've updated the question to have more specifics, and to show my hacking. — redec, Sep 02 '16 at 21:22
@dbc No to the preservereferences...and I'm not sure about the value types...technically from a 'solid infrastructure' perspective it would be correct to get value types which implement IDeserializationCallback also, but I'm not really concerned about them - realistically speaking it would likely never actually be needed, and if it ever is it likely will not be a big deal to change it to a reference type to get it to work. — redec, Sep 02 '16 at 21:26
@redec - Wait, do you actually need to implement support for `IDeserializationCallback`? If so, please say so; otherwise this is a perfect example of the [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). — dbc, Sep 02 '16 at 21:57
@dbc sadly yes, I need to implement support for IDeserializationCallback. We're refactoring an infrastructure component which serializes a large number of legacy objects which implement IDeserializationCallback (the existing serialization implementation uses BinaryFormatter). It is not an option to refactor the objects away from IDeserializationCallback, as that's far too much work and risk (and afaik there are really no good alternative options to accomplish the same thing there anyway...) — redec, Sep 02 '16 at 22:23

dbc · Accepted Answer · 2016-09-02T22:27:45.280

You can create a custom contract resolver that adds an extra, artificial OnDeserialized callback that tracks creation of reference type objects. Here's one example:

    using System;
    using System.Collections.Generic;
    using System.Runtime.Serialization;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Serialization;

    public interface IObjectCreationTracker
    {
        void Add(object obj);

        ICollection<object> CreatedObjects { get; }
    }

    public class ReferenceObjectCreationTracker : IObjectCreationTracker
    {
        public ReferenceObjectCreationTracker()
        {
            this.CreatedObjects = new HashSet<object>();
        }

        public void Add(object obj)
        {
            if (obj == null)
                return;
            // Only track non-string reference types.
            var type = obj.GetType();
            if (type.IsValueType || type == typeof(string))
                return;
            CreatedObjects.Add(obj);
        }

        public ICollection<object> CreatedObjects { get; private set; }
    }

    public class ObjectCreationTrackerContractResolver : DefaultContractResolver
    {
        // The tracker is read from the streaming context, so the resolver itself holds no per-call state.
        readonly SerializationCallback callback = (o, context) =>
            {
                var tracker = context.Context as IObjectCreationTracker;
                if (tracker != null)
                    tracker.Add(o);
            };

        protected override JsonContract CreateContract(Type objectType)
        {
            var contract = base.CreateContract(objectType);
            contract.OnDeserializedCallbacks.Add(callback);
            return contract;
        }
    }

And then use it as follows:

    public static class JsonExtensions
    {
        public static T DeserializeWithTracking<T>(string json, out ICollection<object> objects)
        {
            var tracker = new ReferenceObjectCreationTracker();
            var settings = new JsonSerializerSettings
            {
                ContractResolver = new ObjectCreationTrackerContractResolver(),
                Context = new StreamingContext(StreamingContextStates.All, tracker),
                // Add other settings as required.
                TypeNameHandling = TypeNameHandling.Auto,
            };
            var obj = JsonConvert.DeserializeObject<T>(json, settings);
            objects = tracker.CreatedObjects;
            return obj;
        }
    }

Note that this only returns instances of non-string reference types. Returning instances of value types is more problematic as there is no obvious way to distinguish between a value type that eventually gets embedded into a larger object via a property setter and one that is retained in the object graph as a boxed reference, e.g. as shown in [this question](https://stackoverflow.com/questions/15207260). If the boxed value type eventually gets embedded in some larger object, there is no way to retain a direct reference to it.
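The value-type difficulty can be seen without Json.NET at all. The following is a minimal sketch using hypothetical `Point` and `Holder` types (neither is part of the answer): a struct assigned through a property setter is copied, so a reference kept to the boxed original says nothing about the value that ended up in the graph, whereas a boxed instance stored as `object` is retained as-is.

```csharp
using System;

// Hypothetical types, for illustration only.
public struct Point
{
    public int X;
}

public class Holder
{
    public Point Location { get; set; } // stored by value inside Holder
    public object Boxed { get; set; }   // stores a reference to the boxed instance
}

public static class BoxingDemo
{
    public static void Main()
    {
        object boxed = new Point { X = 1 };  // a boxed struct: something a tracker could reference

        var holder = new Holder();
        holder.Location = (Point)boxed;      // unboxing copies the value into the holder
        holder.Boxed = boxed;                // here the same boxed instance is retained

        // Replace the embedded value; the boxed original is unaffected.
        holder.Location = new Point { X = 2 };

        Console.WriteLine(((Point)boxed).X);                     // 1
        Console.WriteLine(holder.Location.X);                    // 2
        Console.WriteLine(ReferenceEquals(holder.Boxed, boxed)); // True
    }
}
```

A tracker holding `boxed` would therefore be pointing at a stale copy in the `Location` case, but at the live object in the `Boxed` case, and nothing visible at callback time distinguishes the two.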

Also note the use of StreamingContext.Context to pass the tracker down into the callback.

You may want to cache the contract resolver for best performance.
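As a rough sketch of what caching could look like: the resolver above holds no per-call state (the tracker travels in the `StreamingContext`), so one instance can plausibly be shared across calls and only the settings object rebuilt. `TrackingSettings` and `Create` are hypothetical names, and the snippet assumes `IObjectCreationTracker` and `ObjectCreationTrackerContractResolver` from the answer are in scope.

```csharp
using System.Runtime.Serialization;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Hypothetical helper, not part of Json.NET.
public static class TrackingSettings
{
    // A single shared resolver, so the contracts it has already built are reused
    // instead of being re-created for every deserialization call.
    static readonly IContractResolver resolver = new ObjectCreationTrackerContractResolver();

    public static JsonSerializerSettings Create(IObjectCreationTracker tracker)
    {
        return new JsonSerializerSettings
        {
            ContractResolver = resolver,
            // A fresh tracker per call is passed through the context, not the resolver.
            Context = new StreamingContext(StreamingContextStates.All, tracker),
        };
    }
}
```

Each call would then pass its own tracker, for example `JsonConvert.DeserializeObject<T>(json, TrackingSettings.Create(new ReferenceObjectCreationTracker()))`.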

Update

In answer to the updated question of how to implement IDeserializationCallback with Json.NET, the above should work for reference types. For value types that implement this interface, you could:

- Call the method immediately in the OnDeserialized callback rather than deferring it until deserialization is complete, or
- Throw an exception indicating that IDeserializationCallback is not supported for structs.
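For the reference-type case, the pieces might be combined roughly as follows. This is a sketch, not a tested implementation: `DeserializeWithCallbacks` is a hypothetical name, the tracker and resolver types are assumed to be the ones defined in the answer, and passing `null` as the sender simply mirrors the question's code.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Runtime.Serialization;
using Newtonsoft.Json;

public static class DeserializationCallbackExtensions
{
    // Hypothetical helper; assumes ReferenceObjectCreationTracker and
    // ObjectCreationTrackerContractResolver from the answer are in scope.
    public static T DeserializeWithCallbacks<T>(string json)
    {
        var tracker = new ReferenceObjectCreationTracker();
        var settings = new JsonSerializerSettings
        {
            ContractResolver = new ObjectCreationTrackerContractResolver(),
            Context = new StreamingContext(StreamingContextStates.All, tracker),
        };

        var result = JsonConvert.DeserializeObject<T>(json, settings);

        // Run the callbacks only after the whole graph has been built.
        // ToList() takes a snapshot in case a callback touches the tracker.
        List<IDeserializationCallback> callbacks =
            tracker.CreatedObjects.OfType<IDeserializationCallback>().ToList();
        foreach (var callback in callbacks)
            callback.OnDeserialization(null);

        return result;
    }
}
```

One design point to be aware of: because the tracker stores objects in a `HashSet<object>` with the default comparer, callback order is unspecified, and two distinct objects that compare equal through an overridden `Equals` would be recorded only once. If either matters, a `List<object>` or a reference-equality comparer could be substituted in the tracker.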

Awesome, thanks! That looks like it should come out way cleaner. I prolly won't be able to play with it again until next week, but that looks great! — redec, Sep 02 '16 at 23:46
