I'm migrating data from an old proprietary object database format, using JSON as the intermediate format. The objects are output into a JSON array of objects, each of which has an initial field giving the type of the original object, followed by a field called Instance which holds the nested original object.

I need to stream these in as there are potentially hundreds of thousands of them - I can't just read the whole JSON array into memory and then process it.

So the JSON looks like this:

[
{
    "Type": "Foo",
    "Instance": {
        // instance of Foo type
    }
},
{
    "Type": "Bar",
    "Instance": {
        // instance of Bar type
    }
},
// tens or hundreds of thousands more objects...
]

Using Json.NET, what's the best way to stream in one array element at a time, access the "Type" property, and then deserialize the "Instance" to a .NET object of the appropriate type?

Edit: although there is a similar question regarding reading a large JSON array, the specifics of accessing the instance are not answered in that question.

Mike Scott
  • http://stackoverflow.com/a/17788118/996081 – cbr Mar 11 '15 at 18:30
  • Can you load each individual object into memory? – dbc Mar 11 '15 at 19:29
  • Your question has two independent parts: 1) Walk a huge JSON array item by item; 2) Given an item, deserialize it to a run-time type. For 1), see here: https://stackoverflow.com/questions/20374083/deserialize-json-array-stream-one-item-at-a-time/20386292#20386292. For 2), look at the techniques here: https://stackoverflow.com/questions/19307752/deserializing-polymorphic-json-classes-without-type-information-using-json-net – dbc Mar 11 '15 at 19:32

1 Answer

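Putting together the answers to the two questions linked in the comments above (streaming a JSON array one item at a time, and deserializing polymorphic JSON to a run-time type):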

First, assume you have a custom SerializationBinder (or something similar) that will map type names to types.
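As a rough illustration of what such a binder might look like, here is a minimal sketch that maps the "Type" strings from the question to .NET types through a fixed lookup table. The Foo and Bar classes and their properties are hypothetical placeholders, and the table-based lookup is only one possible approach; the answer does not say how MyBinder is actually implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

// Hypothetical target types standing in for the migrated classes.
public class Foo { public string Name { get; set; } }
public class Bar { public int Count { get; set; } }

// Assumed implementation: resolves the "Type" strings in the JSON
// to .NET types using a fixed dictionary.
public class MyBinder : SerializationBinder
{
    private static readonly Dictionary<string, Type> KnownTypes =
        new Dictionary<string, Type>(StringComparer.Ordinal)
        {
            { "Foo", typeof(Foo) },
            { "Bar", typeof(Bar) },
        };

    // assemblyName is ignored in this sketch. Unknown or null type names
    // yield null so that callers can skip those items.
    public override Type BindToType(string assemblyName, string typeName)
    {
        Type type;
        return typeName != null && KnownTypes.TryGetValue(typeName, out type)
            ? type
            : null;
    }
}
```

Returning null for an unrecognized name, rather than throwing, fits the `where type != null` filter used in the query further down; a binder that throws instead would surface unexpected type names early, which may be preferable during a migration.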

Next, you can enumerate through the top-level objects in streaming JSON data (walking into top-level arrays) with the following extension method:

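using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class JsonExtensions
{
    // Lazily yields each JSON object found while walking into top-level arrays,
    // so only one object is held in memory at a time.
    public static IEnumerable<JObject> WalkObjects(this TextReader textReader)
    {
        using (JsonTextReader reader = new JsonTextReader(textReader))
        {
            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.StartObject)
                {
                    // Load() consumes the whole object, nested content included,
                    // and leaves the reader on its EndObject token.
                    yield return JObject.Load(reader);
                }
            }
        }
    }
}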

Then, assuming you have some stream for reading your JSON data, you can stream the JSON in and convert top-level array elements one by one for processing as follows:

        SerializationBinder binder = new MyBinder(); // Your custom binder.
        using (var stream = GetStream(json)) // GetStream: your own method for opening the JSON source.
        using (var reader = new StreamReader(stream, Encoding.Unicode)) // Encoding.Unicode is UTF-16; use whatever encoding your data is in.
        {
            var assemblyName = System.Reflection.Assembly.GetExecutingAssembly().GetName().Name;
            // Lazy query: nothing is read until the foreach below enumerates it.
            var items = from obj in JsonExtensions.WalkObjects(reader)
                        let jType = obj["Type"]
                        let jInstance = obj["Instance"]
                        where jType != null && jType.Type == JTokenType.String
                        where jInstance != null && jInstance.Type == JTokenType.Object
                        let type = binder.BindToType(assemblyName, (string)jType)
                        where type != null
                        select jInstance.ToObject(type); // Deserialize to bound type!

            foreach (var item in items)
            {
                // Handle each item.
                Debug.WriteLine(JsonConvert.SerializeObject(item));
            }
        }
dbc