I'm interested in a performance comparison (speed, memory usage) of two approaches to deserializing an HTTP response JSON payload using Newtonsoft.Json.
I'm aware of Newtonsoft.Json's Performance Tips recommendation to use streams, but I wanted to know more and have hard numbers. I've written a simple benchmark using BenchmarkDotNet, but I'm a bit puzzled by the results (see the numbers below).
What I got:
- parsing from a stream is always faster, but not by much
- parsing small and "medium" JSON has better or equal memory usage when using a string as input
- a significant difference in memory usage only shows up with large JSON (where the string itself ends up on the LOH)
I haven't had time to do proper profiling (yet), and I'm a bit surprised by the memory overhead of the stream approach (assuming there's no error on my side). The whole code is here.
Questions
- Is my approach correct (usage of MemoryStream, simulating HttpResponseMessage and its content, ...)?
- Is there any issue with the benchmarking code?
- Why do I see such results?
Benchmark setup
I'm preparing a MemoryStream to be reused over and over within a benchmark run:
[GlobalSetup]
public void GlobalSetup()
{
    var resourceName = _resourceMapping[typeof(T)];
    using (var resourceStream = Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName))
    {
        _memory = new MemoryStream();
        resourceStream.CopyTo(_memory);
    }
    _iterationRepeats = _repeatMapping[typeof(T)];
}
Stream deserialization
[Benchmark(Description = "Stream d13n")]
public async Task DeserializeStream()
{
for (var i = 0; i < _iterationRepeats; i++)
{
var response = BuildResponse(_memory);
using (var streamReader = BuildNonClosingStreamReader(await response.Content.ReadAsStreamAsync()))
using (var jsonReader = new JsonTextReader(streamReader))
{
_serializer.Deserialize<T>(jsonReader);
}
}
}
String deserialization
Here we first read the JSON from the stream into a string and only then deserialize it - an extra string is allocated and then used as the input for deserialization.
[Benchmark(Description = "String d13n")]
public async Task DeserializeString()
{
for (var i = 0; i < _iterationRepeats; i++)
{
var response = BuildResponse(_memory);
var content = await response.Content.ReadAsStringAsync();
JsonConvert.DeserializeObject<T>(content);
}
}
Common methods
private static HttpResponseMessage BuildResponse(Stream stream)
{
    stream.Seek(0, SeekOrigin.Begin);
    var content = new StreamContent(stream);
    content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
    return new HttpResponseMessage(HttpStatusCode.OK)
    {
        Content = content
    };
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static StreamReader BuildNonClosingStreamReader(Stream inputStream) =>
    new StreamReader(
        stream: inputStream,
        encoding: Encoding.UTF8,
        detectEncodingFromByteOrderMarks: true,
        bufferSize: 1024,
        leaveOpen: true);
Results
Small JSON
Repeated 10000 times
- Stream: mean 25.69 ms, 61.34 MB allocated
- String: mean 31.22 ms, 36.01 MB allocated
Medium JSON
Repeated 1000 times
- Stream: mean 24.07 ms, 12 MB allocated
- String: mean 25.09 ms, 12.85 MB allocated
Large JSON
Repeated 100 times
- Stream: mean 229.6 ms, 47.54 MB allocated, objects got to Gen 1
- String: mean 240.8 ms, 92.42 MB allocated, objects got to Gen 2!
Update
I went through the source of JsonConvert and found out that it internally uses a JsonTextReader over a StringReader when deserializing from a string: JsonConvert:816. So a JsonTextReader is involved there as well (of course!).
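For context, here is a minimal sketch of what that string path boils down to (simplified; the helper name DeserializeFromString is mine, and the real JsonConvert additionally deals with settings, null checks, etc.):
// Roughly equivalent to JsonConvert.DeserializeObject<T>(content) (simplified):
// the string gets wrapped in a StringReader, which feeds a JsonTextReader.
// Requires System.IO and Newtonsoft.Json.
private static T DeserializeFromString<T>(string json, JsonSerializer serializer)
{
    using (var stringReader = new StringReader(json))
    using (var jsonReader = new JsonTextReader(stringReader))
    {
        return serializer.Deserialize<T>(jsonReader);
    }
}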
Then I decided to dig into StreamReader itself, and I was stunned at first sight - it always allocates an array buffer (byte[]): StreamReader:244, which explains its memory usage.
This answers the "why". The solution is simple - use a smaller buffer size when instantiating the StreamReader. The minimum buffer size defaults to 128 (see StreamReader.MinBufferSize), but you can supply any value > 0 (check one of the ctor overloads).
Of course the buffer size has an effect on processing the data. As for what buffer size to use: it depends. When expecting smaller JSON responses, I think it is safe to stick with a small buffer.
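To illustrate, here is a variant of BuildNonClosingStreamReader from above using a smaller buffer; the 256 is just an example value I picked, not a recommendation:
// Same as BuildNonClosingStreamReader above, but with a smaller read buffer.
// 256 is an arbitrary illustrative value - pick it based on expected response sizes.
// Requires System.IO, System.Text and System.Runtime.CompilerServices.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static StreamReader BuildSmallBufferStreamReader(Stream inputStream) =>
    new StreamReader(
        stream: inputStream,
        encoding: Encoding.UTF8,
        detectEncodingFromByteOrderMarks: true,
        bufferSize: 256,
        leaveOpen: true);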