16

I have a JSON object which contains, among others, a data field that stores a Base64-encoded string. This JSON is serialized and sent to a client.

On the client side, the Newtonsoft Json.NET deserializer is used to get the JSON back. However, if the data field becomes large (~400 MB), the deserializer throws an out-of-memory exception: Array Dimensions exceeded supported Range. I can also see in Task Manager that memory consumption grows very fast.

Any ideas why this is? Is there a maximum size for JSON fields or something?

Code example (simplified):

HttpResponseMessage responseTemp = client.PostAsJsonAsync(client.BaseAddress, message).Result;

string jsonContent = responseTemp.Content.ReadAsStringAsync().Result;
Result result = JsonConvert.DeserializeObject<Result>(jsonContent);

Result class:

public class Result
{
    public string Message { get; set; }
    public byte[] Data { get; set; }
}

UPDATE:

I think my problem is not the serializer, but simply trying to handle such a huge string in memory. At the point where I read the string into memory, the application's memory consumption explodes, and every operation on that string does the same. At the moment, I think I have to find a way to work with streams and stop reading everything into memory at once.

DanielG
  • 1,217
  • 1
  • 16
  • 37
  • Can you share the code for the object that is being created by deserialization? – Russ Clarke Nov 02 '15 at 15:00
  • just added a small code snippet. Would already help if someone can confirm that such huge base64 strings are not a problem in general. – DanielG Nov 02 '15 at 15:07
  • It would help to see the code for the 'result' type too. – Russ Clarke Nov 02 '15 at 15:07
  • 1
    Possible duplicate of [Incremental JSON Parsing in C#](http://stackoverflow.com/questions/9026508/incremental-json-parsing-in-c-sharp) – Nasreddine Nov 02 '15 at 15:08
  • @DanielG Is it the JsonConvert.DeserializeObject that fails? – Russ Clarke Nov 02 '15 at 15:08
  • yes, that's the one that fails with OOM Exception. Just added the result class, which is really simple. – DanielG Nov 02 '15 at 15:11
  • http://stackoverflow.com/questions/21151125/deserializing-large-json-objects-from-web-service-out-of-memory – Robert Nov 02 '15 at 15:19
  • I already tried that, but doesn't make any difference. Exact same error. Just posted the string version to keep the code simple. – DanielG Nov 02 '15 at 15:20
  • I also tried using the JsonTextReader directly. That also doesn't work. It will fail when it comes to JsonTextReader.Read on the huge field. And it throws the exact same error. – DanielG Nov 02 '15 at 15:22
  • In c#, [an array can only hold `int.MaxValue` items](http://stackoverflow.com/questions/1391672). From the error message it sounds like Json.NET is trying to create an array larger than this. What is the full `ToString()` output of the exception *including the traceback*? – dbc Nov 02 '15 at 18:21

4 Answers

23

Reading a large JSON string with JsonConvert.DeserializeObject consumes a lot of memory. One way to overcome this issue is to create an instance of JsonSerializer and deserialize from a stream, as shown below.

using (StreamReader r = new StreamReader(filePath))
{
    using (JsonReader reader = new JsonTextReader(r))
    {
        JsonSerializer serializer = new JsonSerializer();
        T lstObjects = serializer.Deserialize<T>(reader);
    }
}

Here, filePath is the path to your JSON file and T is your generic target type.

Brandon Minnick
  • 13,342
  • 15
  • 65
  • 123
Dilip Langhanoja
  • 4,455
  • 4
  • 28
  • 37
  • 2
    This should be the accepted answer. It presents a good practice instead of the answers that do not cure the problem but only try to minimize the risk. Thanks! – Mariusz Ignatowicz Aug 24 '17 at 17:28
  • this solution solved a huge memory leak problem for me – MIKE Dec 05 '17 at 19:04
  • His question is not about file path but for a string as json. Can you please extend your answer? – Emil Dec 09 '17 at 16:40
  • I am facing an issue where, even with the use of a StreamReader to deserialize JSON data, I still get the out-of-memory problem. Maybe this solution together with gcAllowVeryLargeObjects should do the trick. – Fabiano Aug 02 '19 at 19:40
  • @Fabiano Me too. Did you ever solve this? – pushkin Jul 19 '22 at 21:59
  • 1
    @pushkin To be frank with you, I don't remember exactly how this was solved. But the working solution today is that we are compressing the data before serializing and deserializing. If you deserialize a compressed data, you won't have this out of memory problem. After that, you can just uncompress it and you will be good to go. – Fabiano Jul 24 '22 at 00:23
12

You have two problems here:

  1. You have a single Base64 data field inside your JSON response that is larger than ~400 MB.

  2. You are loading the entire response into an intermediate string jsonContent that is even larger since it embeds the single data field.

Firstly, I assume you are using 64 bit. If not, switch.

Unfortunately, the first problem can only be ameliorated and not fixed because Json.NET's JsonTextReader does not have the ability to read a single string value in "chunks" in the same way as XmlReader.ReadValueChunk(). It will always fully materialize each atomic string value. But .Net 4.5 adds the following settings that may help:

  1. <gcAllowVeryLargeObjects enabled="true" />.

    This setting allows for arrays with up to int.MaxValue entries even if that would cause the underlying memory buffer to be larger than 2 GB. You will still be unable to read a single JSON token of more than 2^31 characters in length, however, since JsonTextReader buffers the full contents of each single token in a private char[] _chars; array, and, in .Net, an array can only hold up to int.MaxValue items.

  2. GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce.

    This setting allows the large object heap to be compacted and may reduce out-of-memory errors due to address space fragmentation.

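For reference, the first setting is a runtime element in app.config; a minimal sketch might look like this:

```xml
<!-- app.config: allow arrays whose backing buffer exceeds 2 GB (64-bit processes only) -->
<configuration>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```

The LOH compaction is requested in code with `GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;` followed by `GC.Collect();`; the setting applies to the next blocking full collection and then resets itself.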
The second problem, however, can be addressed by streaming deserialization, as shown in this answer to this question by Dilip0165; Efficient api calls with HttpClient and JSON.NET by John Thiriet; Performance Tips: Optimize Memory Usage by Newtonsoft; and Streaming with New .NET HttpClient and HttpCompletionOption.ResponseHeadersRead by Tugberk Ugurlu. Pulling together the information from these sources, your code should look something like:

Result result;
var requestJson = JsonConvert.SerializeObject(message); // Here we assume the request JSON is not too large
using (var requestContent = new StringContent(requestJson, Encoding.UTF8, "application/json"))
using (var request = new HttpRequestMessage(HttpMethod.Post, client.BaseAddress) { Content = requestContent })
using (var response = client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).Result)
using (var responseStream = response.Content.ReadAsStreamAsync().Result)
{
    if (response.IsSuccessStatusCode)
    {
        using (var textReader = new StreamReader(responseStream))
        using (var jsonReader = new JsonTextReader(textReader))
        {
            result = JsonSerializer.CreateDefault().Deserialize<Result>(jsonReader);
        }
    }
    else
    {
        // TODO: handle an unsuccessful response somehow, e.g. by throwing an exception
    }
}

Or, using async/await:

Result result;
var requestJson = JsonConvert.SerializeObject(message); // Here we assume the request JSON is not too large
using (var requestContent = new StringContent(requestJson, Encoding.UTF8, "application/json"))
using (var request = new HttpRequestMessage(HttpMethod.Post, client.BaseAddress) { Content = requestContent })
using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
using (var responseStream = await response.Content.ReadAsStreamAsync())
{
    if (response.IsSuccessStatusCode)
    {
        using (var textReader = new StreamReader(responseStream))
        using (var jsonReader = new JsonTextReader(textReader))
        {
            result = JsonSerializer.CreateDefault().Deserialize<Result>(jsonReader);
        }
    }
    else
    {
        // TODO: handle an unsuccessful response somehow, e.g. by throwing an exception
    }
}           

My code above isn't fully tested, and error and cancellation handling need to be implemented. You may also need to set the timeout as shown here and here. Json.NET's JsonSerializer does not support async deserialization, making it a slightly awkward fit with the asynchronous programming model of HttpClient.

Finally, as an alternative to using Json.NET to read a huge Base64 chunk from a JSON file, you could use the reader returned by JsonReaderWriterFactory, which does support reading Base64 data in manageable chunks. For details, see this answer to Parse huge OData JSON by streaming certain sections of the json to avoid LOH for an explanation of how to stream through a huge JSON file using this reader, and this answer to Read stream from XmlReader, base64 decode it and write result to file for how to decode Base64 data in chunks using XmlReader.ReadElementContentAsBase64.
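An untested sketch of that approach, assuming the huge property is named `data` and should be decoded straight to a file (`JsonReaderWriterFactory` exposes the JSON through an XML-infoset reader, where each JSON property appears as an element named after the property):

```csharp
using System.IO;
using System.Runtime.Serialization.Json;
using System.Xml;

using (Stream input = File.OpenRead("huge.json"))   // hypothetical input file
using (XmlDictionaryReader reader = JsonReaderWriterFactory.CreateJsonReader(
           input, XmlDictionaryReaderQuotas.Max))
using (Stream output = File.Create("data.bin"))     // hypothetical output file
{
    byte[] buffer = new byte[64 * 1024];
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "data")
        {
            int read;
            // ReadElementContentAsBase64 decodes the value incrementally, so the
            // ~400 MB string is never materialized in memory all at once.
            while ((read = reader.ReadElementContentAsBase64(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);
            break;
        }
    }
}
```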

dbc
  • 104,963
  • 20
  • 228
  • 340
  • yeah, I'm on 64 bit. Setting gcAllowVeryLargeObjects helped. But however, since memory is increasing very fast, I will try to change the handling and avoid reading the stuff in to memory. Thanks! – DanielG Nov 03 '15 at 09:28
  • 1
    @DanielG - If you decide to do the parsing manually, take a look at [Decode base64 filestream with FromBase64Transform](http://stackoverflow.com/questions/30121310/decode-base64-filestream-with-frombase64transform). The 2nd answer shows how you can stream & decode base64 character data of any length. – dbc Nov 03 '15 at 09:33
  • What should be the solution for earlier versions of .NET 4.5? – Fabiano Aug 02 '19 at 19:44
  • @Fabiano - is your problem that the JSON response overall is huge, or that it contains a *single property* that is huge? If the response is huge use streaming as shown in the other answer. If a single property is huge then it's not obvious what to do. Can you explain in more detail? – dbc Aug 02 '19 at 21:42
  • @dbc, we have figured out here that the object to be deserialized has 30 MB, and the OutOfMemory exception is being raised. It is not a single property, but the whole object that is huge. We are thinking about zipping the string instead of deserializing it. And then, the other side can handle the unzip and the treatment for the rest. Unless there is another good idea to deserialize huge objects with .NET 4.0 :) – Fabiano Aug 02 '19 at 22:11
  • @Fabiano - If the JSON is huge but the individual properties are not then 1) Be sure you are running in 64 bit; and 2) Stream directly as shown in the other answer. 30 MB isn't actually a huge amount of memory to allocate overall -- it's just a problem to allocate *contiguously*. If streaming doesn't fix your problem you should ask another question and include a [mcve]. – dbc Aug 02 '19 at 22:23
2

Huge Base64 strings aren't a problem as such; .NET supports object sizes of around 2 GB, see the answer here. Of course, that doesn't mean you can store 2 GB of information in an object!

However, I get the feeling that it's the byte[] that's the problem.

If there are too many elements for a byte[] to contain, it doesn't matter whether you stream the result or even read it from a file on your hard drive.

So, just for testing purposes, can you try changing the type of that field from byte[] to string, or even perhaps a List<byte>? It's not elegant or even perhaps advisable, but it might point the way to a better solution.

Edit:

Another test case to try: instead of calling DeserializeObject, try just saving that jsonContent string to a file and see how big it is.

Also, why do you need it in memory? What sort of data is it? It seems to me that if you've got to process this in memory then you're going to have a bad time - the size of the object is simply too large for the CLR.

Just had a little inspiration, however: what about trying a different deserializer? Perhaps RestSharp, or you can use HttpClient.ReadAsAsync<T>. It is possible that it's Newtonsoft itself that has a problem, especially if the size of the content is around 400 MB.
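If you try `ReadAsAsync<T>`, a minimal sketch (the extension method lives in the Microsoft.AspNet.WebApi.Client NuGet package) would be:

```csharp
// Deserializes the response body directly, without going through an
// intermediate string; whether it survives a ~400 MB field is untested.
Result result = await responseTemp.Content.ReadAsAsync<Result>();
```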

Community
  • 1
  • 1
Russ Clarke
  • 17,511
  • 4
  • 41
  • 45
  • Just switched to a pure string. Unfortunately it doesn't make any difference. I get the exact same errors. In Taskmanager I see that memory grows up to 3.5 GB within seconds. – DanielG Nov 02 '15 at 15:59
  • That's a real shame... I just updated my answer with another suggestion. – Russ Clarke Nov 02 '15 at 16:17
  • ok thanks again. I don't need it in memory. It will be written to a database later. Unfortunately I need to handle it that way because of some ugly legacy code. I will try another serializer...that's a good idea. – DanielG Nov 02 '15 at 16:41
0

If you know the JSON result is an array, you can use the following (based on https://stackoverflow.com/a/24115672/989451). It helped me avoid out-of-memory exceptions by reading and deserializing each array item separately:

public static IEnumerable<T> DeserializeSequenceFromJson<T>(TextReader readerStream)
{
    using (var reader = new JsonTextReader(readerStream))
    {
        var serializer = new JsonSerializer();

        // find start of array
        while (reader.TokenType != JsonToken.StartArray)
        {
            reader.Read();
        }

        // deserialize each item until end of array
        while (reader.Read())
        {
            if (reader.TokenType == JsonToken.EndArray) break;
            var item = serializer.Deserialize<T>(reader);
            yield return item;
        }
    }
}
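Usage might look like this (with `Item` standing in for your element type and `Process` a hypothetical per-item handler):

```csharp
using (var stream = await response.Content.ReadAsStreamAsync())
using (var textReader = new StreamReader(stream))
{
    // Only one deserialized item is held in memory at a time.
    foreach (var item in DeserializeSequenceFromJson<Item>(textReader))
        Process(item);
}
```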
Jeroen K
  • 10,258
  • 5
  • 41
  • 40