Consider the following code, which uses FSharp.Data to request data from a web resource:

let resp = Http.RequestStream(url, headers, query)
use rdr = new StreamReader(resp.ResponseStream)
use jrdr = new JsonTextReader(rdr)
let serializer = new JsonSerializer()
let myArray = serializer.Deserialize<someType[]>(jrdr).Value

myArray is an array of someType. Arrays are eagerly evaluated, so if I request a large amount of data, I will consume a large amount of RAM up front.

What if I ask Json.NET to give me a seq instead?

let resp = Http.RequestStream(url, headers, query)
use rdr = new StreamReader(resp.ResponseStream)
use jrdr = new JsonTextReader(rdr)
let serializer = new JsonSerializer()
let mySeq = serializer.Deserialize<someType seq>(jrdr).Value

If I iterate through mySeq and write it to a text file, is everything pulled from the stream and deserialized lazily? Or does the act of asking Json.NET to deserialize force everything to be eagerly evaluated at that point?

UPDATE

Following on from dbc's accepted answer, a more functional version of the lazy function would be something like the following:

let jsonSeqFromStream<'T>(stream:Stream) = seq{
    let serializer = JsonSerializer.CreateDefault()
    use rdr = new StreamReader(stream, Encoding.UTF8, true, 4096, true)
    use jrdr = new JsonTextReader(rdr, CloseInput = false)
    let rec resSeq inArray = seq{
        if jrdr.Read() then
            match jrdr.TokenType with
            |JsonToken.Comment -> yield! resSeq inArray
            |JsonToken.StartArray when not inArray -> yield! resSeq true
            |JsonToken.EndArray when inArray -> yield! resSeq false
            |_ ->
                let resObj = serializer.Deserialize<'T>(jrdr)
                yield resObj
                yield! resSeq inArray
        else
            ()
    }
    yield! resSeq false
}
Chechy Levas

1 Answer

Json.NET can deserialize a sequence lazily, but it does not do so automatically. Instead, you will have to adapt to F# one of the answers from "Parsing large json file in .NET" or "Newtonsoft JSon Deserialize into Primitive type".

To confirm that deserialization of a sequence is not lazy by default, define the following function:

let jsonFromStream<'T>(stream : Stream) =
    Console.WriteLine(typeof<'T>) // Print incoming type for debugging purpose
    let serializer = JsonSerializer.CreateDefault()
    use rdr = new StreamReader(stream, Encoding.UTF8, true, 4096, true)
    use jrdr = new JsonTextReader(rdr, CloseInput = false)
    let res = serializer.Deserialize<'T>(jrdr)
    Console.WriteLine(res.GetType()) // Print outgoing type for debugging purpose
    res

Then, if we have a stream named stream that contains a JSON array of someType objects, and we call the function like so:

let mySeq = jsonFromStream<someType seq>(stream)

Then the following debug output is generated:

System.Collections.Generic.IEnumerable`1[Oti4jegh9906+someType]
System.Collections.Generic.List`1[Oti4jegh9906+someType]

As you can see, from the .NET point of view, calling JsonSerializer.Deserialize<T>() with someType seq is just the same as calling it with IEnumerable<someType> from C#, and in that case Json.NET materializes the whole result and returns it as a List<someType>.
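The same effect can be reproduced without Json.NET. The following is a minimal sketch rather than part of the demo fiddle, and the names materialized, asSeq and isList are purely illustrative. It shows that a value statically typed as seq<int> can be backed by a fully populated List<int>, so the seq type alone says nothing about laziness.

```fsharp
open System.Collections.Generic

// A fully materialized list: every element already exists in memory.
let materialized = List<int>([ 1; 2; 3 ])

// Upcast to seq<int> (that is, IEnumerable<int>).
// The static type no longer reveals the backing collection.
let asSeq : seq<int> = materialized :> seq<int>

// A runtime type test shows the value is still a List<int>,
// which mirrors the debug output above.
let isList = asSeq :? List<int>
printfn "%b" isList
```

Declaring the target as seq only changes the interface the caller sees. Whether the elements are produced on demand depends on what object sits behind that interface.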

Demo fiddle #1 here.

To parse the JSON array as a lazy sequence, you will need to manually create a seq function that iterates through the JSON with JsonReader.Read() and deserializes and yields each array entry:

let jsonSeqFromStream<'T>(stream : Stream) =
    seq {
        // Adapted from this answer https://stackoverflow.com/a/35298655
        // To https://stackoverflow.com/questions/35295220/newtonsoft-json-deserialize-into-primitive-type
        let serializer = JsonSerializer.CreateDefault()
        use rdr = new StreamReader(stream, Encoding.UTF8, true, 4096, true)
        use jrdr = new JsonTextReader(rdr, CloseInput = false)
        let inArray = ref false
        while jrdr.Read() do
            if (jrdr.TokenType = JsonToken.Comment) then
                ()
            elif (jrdr.TokenType = JsonToken.StartArray && not inArray.Value) then
                inArray.Value <- true
            elif (jrdr.TokenType = JsonToken.EndArray && inArray.Value) then
                inArray.Value <- false
            else
                let res = serializer.Deserialize<'T>(jrdr)
                yield res
    }

(Since tracking whether we are parsing array value(s) is stateful, this doesn't look very functional. Maybe it could be done better?)

The return of this function could be used as follows, e.g.:

let mySeq = jsonSeqFromStream<someType>(stream)

mySeq |> Seq.iter (fun (s) -> printfn "%s" (JsonConvert.SerializeObject(s)))

Demo fiddle #2 here.
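One way to check that such a sequence really is lazy is to count how many items get deserialized when only a few are consumed. The sketch below is an assumption-laden illustration, not part of the demo fiddle: it is written as an F# script that pulls Newtonsoft.Json from NuGet, it repeats the reader loop in a hypothetical function named countingJsonSeq with an added counter, and it uses an in-memory array of integers as stand-in data.

```fsharp
#r "nuget: Newtonsoft.Json"

open System.IO
open System.Text
open Newtonsoft.Json

// Counter added for this demonstration only.
let deserialized = ref 0

// Hypothetical variant of the lazy reader: same loop, plus the counter.
let countingJsonSeq<'T> (stream : Stream) : seq<'T> =
    seq {
        let serializer = JsonSerializer.CreateDefault()
        use rdr = new StreamReader(stream, Encoding.UTF8, true, 4096, true)
        use jrdr = new JsonTextReader(rdr, CloseInput = false)
        let inArray = ref false
        while jrdr.Read() do
            if jrdr.TokenType = JsonToken.Comment then
                ()
            elif jrdr.TokenType = JsonToken.StartArray && not inArray.Value then
                inArray.Value <- true
            elif jrdr.TokenType = JsonToken.EndArray && inArray.Value then
                inArray.Value <- false
            else
                deserialized.Value <- deserialized.Value + 1
                yield serializer.Deserialize<'T>(jrdr)
    }

// Stand-in data: a JSON array holding the integers 1 to 100000.
let json = "[" + String.concat "," (Seq.map string [ 1 .. 100000 ]) + "]"
let stream = new MemoryStream(Encoding.UTF8.GetBytes json)

// Consume only the first three items.
let firstThree = countingJsonSeq<int>(stream) |> Seq.truncate 3 |> Seq.toList

printfn "first items: %A" firstThree
// If the sequence is lazy, this count stays far below 100000.
printfn "items deserialized: %d" deserialized.Value
printfn "bytes read from stream: %d of %d" stream.Position stream.Length
```

The stream position will normally be ahead of the last item consumed, because StreamReader and JsonTextReader both read in buffered chunks. It should still be well short of the stream length.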

dbc