
I am reading an external data file for processing. It is gzip-compressed (.gz) and hosted in S3. It contains JSON objects, one per line with millions of lines per file, as in the mocked-up example below:

{"a":"v1","b":"v2"}
{"a":"v3","b":"v2"}

I am using the following code to process it:

JsonSerializer serializer = new JsonSerializer();
using (GZipStream decompressionStream = new GZipStream(data, CompressionMode.Decompress))
{
    using (StreamReader sr = new StreamReader(decompressionStream))
    {
        using (var reader = new JsonTextReader(sr))
        {
            while(reader.Read())
            {
                if(reader.TokenType == JsonToken.StartObject)
                {
                    var o = serializer.Deserialize<DataObject>(reader);
                }
            }
        }
    }
}

DataObject is just a POCO that the data is mapped into. The first iteration works perfectly, but on the second iteration the call to reader.Read() throws an exception:

Additional text encountered after finished reading JSON content

I think this could be due to the linefeed at the end of each JSON object, but I am not sure how to resolve it.

Any help would be very much appreciated.

Mark Ruse
  • Set `reader.SupportMultipleContent = true;` as shown in [What is the correct way to use JSON.NET to parse stream of JSON objects?](https://stackoverflow.com/q/26601594) and [Line delimited json serializing and de-serializing](https://stackoverflow.com/q/29729063). – dbc Oct 14 '17 at 23:41
  • Thanks dbc - you are right and I think we found that article at exactly the same time :). Appreciate the response – Mark Ruse Oct 14 '17 at 23:42
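
For reference, here is a minimal sketch of the fix suggested in the comment above, assuming Json.NET (Newtonsoft.Json) and a `DataObject` with properties `A` and `B` matching the mocked-up sample. The `LineDelimitedJson` class, the `ReadObjects` method name, and the shape of `DataObject` are hypothetical and only serve the illustration. The only functional change from the code in the question is setting `SupportMultipleContent` on the reader.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using Newtonsoft.Json;

// Hypothetical POCO shaped after the mocked-up sample lines in the question.
public class DataObject
{
    [JsonProperty("a")] public string A { get; set; }
    [JsonProperty("b")] public string B { get; set; }
}

public static class LineDelimitedJson
{
    // Lazily yields one DataObject per root-level JSON object in the gzip stream.
    public static IEnumerable<DataObject> ReadObjects(Stream data)
    {
        var serializer = new JsonSerializer();
        using (var decompressionStream = new GZipStream(data, CompressionMode.Decompress))
        using (var sr = new StreamReader(decompressionStream))
        using (var reader = new JsonTextReader(sr))
        {
            // Allow more than one root-level JSON value in the same stream,
            // so the reader does not throw when the second object starts.
            reader.SupportMultipleContent = true;

            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.StartObject)
                {
                    // Consumes tokens up to the matching EndObject.
                    yield return serializer.Deserialize<DataObject>(reader);
                }
            }
        }
    }
}
```

Because the method is an iterator, objects are deserialized one at a time as the caller enumerates them, which should keep memory use flat for files with millions of lines.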
