I'm calling an API that returns a response of type application/json. The response can be very small, but it can also be huge: 500 MB to 700 MB. I would like to format the content (indentation, new lines, etc.) and write the response to a JSON file with System.Text.Json so that the file can be read easily by a human.

This code writes the response stream directly to a file. It is very efficient in both memory and speed, but it doesn't format the JSON content. Because the stream is copied straight to the file, it uses almost no memory.

var response = await new HttpClient().GetAsync("an url", HttpCompletionOption.ResponseHeadersRead);
var responseStream = await response.Content.ReadAsStreamAsync();

using (var fileStream = File.Open(filePath, FileMode.Create))
{
    await responseStream.CopyToAsync(fileStream);
}

I tried this code to add the formatting, but it doesn't seem right because it uses over 1 GB of memory.

var response = await new HttpClient().GetAsync("an url", HttpCompletionOption.ResponseHeadersRead);
var responseStream = await response.Content.ReadAsStreamAsync();
var jsonDocument = System.Text.Json.JsonDocument.Parse(responseStream);
var jsonWriterOptions = new System.Text.Json.JsonWriterOptions()
{
    Indented = true
};

using (var fileStream = File.Open(filePath, FileMode.Create))
using (var jsonTextWriter = new System.Text.Json.Utf8JsonWriter(fileStream, jsonWriterOptions))
{
    jsonDocument.WriteTo(jsonTextWriter);
}

Is there a more optimized way, one that uses less memory, to deal with huge content?

Alexandre Jobin
  • Have you tried `string jsonFormatted = JValue.Parse(json).ToString(Formatting.Indented);`? – virouz98 Mar 24 '22 at 20:43
  • If I put 500 MB of content into the `jsonFormatted` variable, it will take a lot of memory! – Alexandre Jobin Mar 24 '22 at 20:46
  • By "efficient" it seems you are referring to "with minimal memory usage". I'd suggest updating your question accordingly. Otherwise, you are going to get answers that might be fast enough (so, in that sense, "efficient"), but that still consume a lot of memory by loading the entire file in memory. – julealgon Mar 24 '22 at 20:50
  • @julealgon, that person asks the same question as me, but all the answers work only if you have a small stream. My need is to convert a huge file, so their solutions will take too much memory and CPU. – Alexandre Jobin Mar 24 '22 at 21:04
  • I cancelled my close of this question as a dup of https://stackoverflow.com/questions/43747477/how-to-parse-huge-json-file-as-stream-in-json-net because you specifically tagged `System.Text.Json` -- but that answer should work for you if you use Newtonsoft.Json. – Kirk Woll Mar 24 '22 at 21:14
  • Streaming indentation doesn't seem to be implemented with System.Text.Json. See [On-the-fly formatting a stream of JSON using System.Text.Json](https://stackoverflow.com/q/70536520/3744182) which has no useful answers. (The upvoted answers all load into a `JsonDocument`.) – dbc Mar 25 '22 at 02:32
  • @AlexandreJobin note that I'm aware the question I linked doesn't yet have satisfactory answers (i.e. memory-efficient ones), but at the end of the day, it is exactly the same question you are asking here. So I personally don't see the purpose of having 2 identical questions, which is why I marked it as a duplicate. – julealgon Mar 25 '22 at 13:21
  • @julealgon, you are right that it's a duplicate question. – Alexandre Jobin Mar 25 '22 at 13:44
  • @julealgon: Interestingly, we cannot vote to close this question as a duplicate of that one because it does not have an upvoted or accepted answer. – StriplingWarrior Mar 25 '22 at 15:19
  • I can still see my vote to close @StriplingWarrior . Are you unable to vote? – julealgon Mar 25 '22 at 15:40
  • @julealgon, at least my question has a real solution. It's not with `System.Text.Json`, but the solution with `Json.Net` is a real one. The other question you are referring to has no viable solution. – Alexandre Jobin Mar 25 '22 at 15:48
  • If you just restrict yourself to .NET built-in solutions beyond just System.Text.Json I believe you can do this with `JsonReaderWriterFactory`; would that answer your question? – dbc Mar 25 '22 at 16:24
  • See docs: [Read from a stream using Utf8JsonReader](https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-use-dom-utf8jsonreader-utf8jsonwriter?pivots=dotnet-6-0#read-from-a-stream-using-utf8jsonreader). In the same place, see an example of writing using Utf8JsonWriter. – Alexander Petrov Mar 25 '22 at 16:57
  • @julealgon: The current vote to close cites it being "opinion-based," which I don't agree with. Attempting to close it as a duplicate, and selecting the question dbc referenced, the UI gives an error because that question has no up-voted or accepted answers. – StriplingWarrior Mar 25 '22 at 18:25
  • @AlexandreJobin: If you don't care about doing this using System.Text.Json, I'd say you should remove that tag from the question. Your answer should probably cite [this answer](https://stackoverflow.com/a/30329731/120955) as a source, but I think it's reasonable to not consider this a duplicate of that answer's question, since this one specifically calls out the need for streaming, and neither that question nor its highest-voted answer addresses that need. – StriplingWarrior Mar 25 '22 at 18:31
  • @StriplingWarrior ohhhh I see! I thought that was my vote, but it seems it was removed. Oh well... what a mess. – julealgon Mar 25 '22 at 19:25
  • @StriplingWarrior, at first I wanted a solution with System.Text.Json, but now we can say that it is practically not doable, and this is why I explain it in my answer. About adding the source you suggested: honestly, it wasn't a source when I posted my answer. In fact, I was playing around with Json.Net and found my own solution. But if it can help, I can add it. – Alexandre Jobin Mar 25 '22 at 19:49
  • I assumed you'd followed a link to that answer from a comment dbc had put on my now-deleted answer. But if you figured out how to do it on your own, no need to cite anybody else. :-) – StriplingWarrior Mar 25 '22 at 21:34

1 Answer

It seems that there's no easy way to do it with System.Text.Json, because System.Text.Json.Utf8JsonReader cannot read from a Stream directly: it only accepts a ReadOnlySpan<byte> or a ReadOnlySequence<byte>. The straightforward way around this limitation is to load the content with System.Text.Json.JsonDocument, which holds the whole document in memory, so it obviously takes up a lot of memory.
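For reference, the pattern the Microsoft docs describe for feeding Utf8JsonReader from a stream (fill a buffer, read the complete tokens, then carry the leftover bytes and the JsonReaderState into the next pass) can be combined with an indenting Utf8JsonWriter. What follows is only a minimal sketch of that idea, not a built-in feature: the names StreamingJsonIndenter, Indent and CopyToken and the buffer sizes are assumptions made up for this example, and it assumes .NET 6 or later for WriteRawValue. Treat it as a starting point rather than a drop-in solution.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical helper, not part of System.Text.Json.
public static class StreamingJsonIndenter
{
    public static void Indent(Stream input, Stream output, int bufferSize = 64 * 1024)
    {
        byte[] buffer = new byte[bufferSize];
        int buffered = 0;               // number of valid bytes currently in the buffer
        bool isFinalBlock = false;
        JsonReaderState state = default;

        using var writer = new Utf8JsonWriter(output, new JsonWriterOptions { Indented = true });

        while (!isFinalBlock)
        {
            // Top up the buffer; a read of 0 bytes means the stream has ended.
            int read = input.Read(buffer, buffered, buffer.Length - buffered);
            isFinalBlock = read == 0;
            buffered += read;

            var reader = new Utf8JsonReader(buffer.AsSpan(0, buffered), isFinalBlock, state);
            while (reader.Read())
            {
                CopyToken(ref reader, writer);
            }

            // Keep the bytes of any incomplete token for the next pass.
            state = reader.CurrentState;
            int consumed = (int)reader.BytesConsumed;
            int leftover = buffered - consumed;

            if (leftover == buffer.Length)
            {
                // A single token is larger than the buffer: grow it (contents are preserved).
                Array.Resize(ref buffer, buffer.Length * 2);
            }
            else
            {
                buffer.AsSpan(consumed, leftover).CopyTo(buffer);
            }
            buffered = leftover;

            // Utf8JsonWriter buffers its output until flushed, so flush regularly
            // to keep memory use bounded.
            if (writer.BytesPending > 16 * 1024)
            {
                writer.Flush();
            }
        }
    }

    private static void CopyToken(ref Utf8JsonReader reader, Utf8JsonWriter writer)
    {
        switch (reader.TokenType)
        {
            case JsonTokenType.StartObject: writer.WriteStartObject(); break;
            case JsonTokenType.EndObject: writer.WriteEndObject(); break;
            case JsonTokenType.StartArray: writer.WriteStartArray(); break;
            case JsonTokenType.EndArray: writer.WriteEndArray(); break;
            case JsonTokenType.PropertyName: writer.WritePropertyName(reader.GetString() ?? string.Empty); break;
            case JsonTokenType.String: writer.WriteStringValue(reader.GetString()); break;
            // Copy the number's original text so its precision and formatting are kept.
            case JsonTokenType.Number: writer.WriteRawValue(reader.ValueSpan, skipInputValidation: true); break;
            case JsonTokenType.True: writer.WriteBooleanValue(true); break;
            case JsonTokenType.False: writer.WriteBooleanValue(false); break;
            case JsonTokenType.Null: writer.WriteNullValue(); break;
        }
    }
}
```

It could be called with the response stream and a FileStream opened on the destination file. Note that the writer's default encoder escapes non-ASCII and some HTML-sensitive characters as \uXXXX, so string values can look different in the output while still being equivalent JSON.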

For now, from what I've read on the web, the simplest memory-efficient solution is to use the Newtonsoft.Json library. Its reader and writer process the content token by token instead of building the whole document, so memory use stays low no matter how large the content is.

In this example, the stream comes from an HttpClient response.

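// Needs: using System.IO; using System.Net.Http; using Newtonsoft.Json;
// ResponseHeadersRead lets the body be streamed instead of buffered in memory first.
using (var response = await new HttpClient().GetAsync("an url", HttpCompletionOption.ResponseHeadersRead))
using (var responseStream = await response.Content.ReadAsStreamAsync())
using (var streamReader = new StreamReader(responseStream))
using (var jsonReader = new JsonTextReader(streamReader))
using (var streamWriter = File.CreateText(destinationFilePath))
using (var jsonTextWriter = new JsonTextWriter(streamWriter))
{
    jsonTextWriter.Formatting = Formatting.Indented;

    // Copy the JSON from the reader to the writer as it is read,
    // without building the whole document in memory.
    while (jsonReader.Read())
    {
        jsonTextWriter.WriteToken(jsonReader);
    }
}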

This example shows how to read an existing file.

using (var streamReader = new StreamReader(sourceFilePath))
using (var jsonReader = new JsonTextReader(streamReader))
using (var streamWriter = File.CreateText(destinationFilePath))
using (var jsonTextWriter = new JsonTextWriter(streamWriter))
{
    jsonTextWriter.Formatting = Formatting.Indented;

    while (jsonReader.Read())
    {
        jsonTextWriter.WriteToken(jsonReader);
    }
}
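
One caveat with the loops above: by default, JsonTextReader converts strings that look like ISO 8601 dates into DateTime values and reads non-integer numbers as double, so a few values can come out written slightly differently from the input. A hedged variant, assuming you want string values copied through untouched, turns the date parsing off. The class and method names here (JsonNetIndenter, Indent) are made up for this sketch, and non-integer numbers still go through double.

```csharp
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper wrapping the same read/write loop as the examples above.
public static class JsonNetIndenter
{
    public static void Indent(TextReader source, TextWriter destination)
    {
        using (var jsonReader = new JsonTextReader(source)
        {
            // Keep date-like strings as plain strings instead of parsing them to DateTime.
            DateParseHandling = DateParseHandling.None,
            // Leave the underlying reader open; the caller owns it.
            CloseInput = false
        })
        using (var jsonWriter = new JsonTextWriter(destination)
        {
            Formatting = Formatting.Indented,
            // Leave the underlying writer open; the caller owns it.
            CloseOutput = false
        })
        {
            while (jsonReader.Read())
            {
                jsonWriter.WriteToken(jsonReader);
            }
        }
    }
}
```

It takes any TextReader and TextWriter, so the StreamReader and the File.CreateText writer from the examples above can be passed in directly.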
Alexandre Jobin