Is it possible to serialize to NDJSON (Newline Delimited JSON) using Json.NET? The Elasticsearch API uses NDJSON for bulk operations, and I can find nothing suggesting that this format is supported by any .NET libraries.

This answer provides guidance for deserializing NDJSON, and it was noted that one could serialize each row independently and join with newline, but I would not necessarily call that supported.

Nathan Taylor
  • That link points to a domain grab. It was created only a couple of years ago, while providers like AWS and Azure have used newline-delimited JSON for several years – Panagiotis Kanavos Mar 05 '18 at 12:16

2 Answers

As Json.NET does not currently have a built-in method to serialize a collection to NDJSON, the simplest answer would be to write to a single TextWriter using a separate JsonTextWriter for each line, setting CloseOutput = false for each:

public static partial class JsonExtensions
{
    public static void ToNewlineDelimitedJson<T>(Stream stream, IEnumerable<T> items)
    {
        // Let caller dispose the underlying stream 
        using (var textWriter = new StreamWriter(stream, new UTF8Encoding(false, true), 1024, true))
        {
            ToNewlineDelimitedJson(textWriter, items);
        }
    }

    public static void ToNewlineDelimitedJson<T>(TextWriter textWriter, IEnumerable<T> items)
    {
        var serializer = JsonSerializer.CreateDefault();

        foreach (var item in items)
        {
            // Formatting.None is the default; I set it here for clarity.
            using (var writer = new JsonTextWriter(textWriter) { Formatting = Formatting.None, CloseOutput = false })
            {
                serializer.Serialize(writer, item);
            }
            // https://web.archive.org/web/20180513150745/http://specs.okfnlabs.org/ndjson/
            // Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A).
            // The newline character MAY be preceded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.
            textWriter.Write("\n");
        }
    }
}

Since the individual NDJSON lines are likely to be short but the number of lines might be large, this answer uses a streaming solution so that no single string larger than 85 KB ever needs to be allocated. As explained in Newtonsoft Json.NET Performance Tips, such large strings end up on the large object heap and may subsequently degrade application performance.
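
As a usage sketch of the streaming approach, the following writes a lazily generated sequence to a file one line at a time, using the same writer pattern as the code above. The `LogEntry` type, the `NdjsonFileExample` class, and its method names are hypothetical and exist only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using Newtonsoft.Json;

// Hypothetical item type, for illustration only.
public class LogEntry
{
    public int Id { get; set; }
    public string Message { get; set; }
}

public static class NdjsonFileExample
{
    // Lazily yields items so the whole collection is never held in memory.
    public static IEnumerable<LogEntry> GenerateEntries(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return new LogEntry { Id = i, Message = "entry " + i };
        }
    }

    // Streams each item to the file as one JSON text followed by "\n".
    public static void WriteEntries(string path, IEnumerable<LogEntry> entries)
    {
        var serializer = JsonSerializer.CreateDefault();

        // UTF-8 without a byte order mark; overwrite any existing file.
        using (var textWriter = new StreamWriter(path, false, new UTF8Encoding(false)))
        {
            foreach (var entry in entries)
            {
                // CloseOutput = false keeps the StreamWriter open for the next line.
                using (var writer = new JsonTextWriter(textWriter) { CloseOutput = false })
                {
                    serializer.Serialize(writer, entry);
                }
                textWriter.Write("\n");
            }
        }
    }
}
```

Calling `NdjsonFileExample.WriteEntries(path, NdjsonFileExample.GenerateEntries(1000000))` never materializes the full list or the full output string; only one `LogEntry` and the `StreamWriter` buffer are live at any time.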

dbc
  • Accepting as answer due to the use of a JsonTextWriter. It seems like this is the most sane approach in the context of what the library already provides, and it is notably more performant than the other answer's approach of creating a new TextWriter for each line. – Nathan Taylor Jun 27 '17 at 20:42
  • Actually, the above is the answer that creates a JsonTextWriter for each line. – jlavallet Jun 27 '17 at 20:45
  • @jlavallet - `JsonConvert.SerializeObject()` internally creates both a `StringWriter` and a `JsonTextWriter`; see [here](https://github.com/JamesNK/Newtonsoft.Json/blob/master/Src/Newtonsoft.Json/JsonConvert.cs#L647) for details. Since the individual JSON lines are likely to be short but the number of lines might be large, I suggested a streaming solution to avoid allocating a single string larger than 85kb as recommended [here](http://www.newtonsoft.com/json/help/html/Performance.htm#MemoryUsage). – dbc Jun 27 '17 at 21:31

You could try this:

string ndJson = JsonConvert.SerializeObject(value, Formatting.Indented);

but now I see that you want more than pretty-printing of the serialized object. If the object you are serializing is some kind of collection or enumeration, could you not do this yourself by serializing each element?

StringBuilder sb = new StringBuilder();
foreach (var element in collection)
{
    sb.AppendLine(JsonConvert.SerializeObject(element, Formatting.None));
}

// use the NDJSON output
Console.WriteLine(sb.ToString());
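
Note that `StringBuilder.AppendLine` terminates each line with `Environment.NewLine`, which is `\r\n` on Windows. The NDJSON spec quoted in the other answer permits a carriage return before the newline, but if a consistent `\n` is preferred, the terminator can be appended explicitly. A minimal sketch; the `NdjsonStringExample` class and `ToNdjsonString` method names are hypothetical:

```csharp
using System.Collections.Generic;
using System.Text;
using Newtonsoft.Json;

public static class NdjsonStringExample
{
    // Serializes each element to one line and terminates it with "\n",
    // regardless of the platform's Environment.NewLine.
    public static string ToNdjsonString<T>(IEnumerable<T> collection)
    {
        var sb = new StringBuilder();
        foreach (var element in collection)
        {
            // Formatting.None keeps each JSON text on a single line.
            sb.Append(JsonConvert.SerializeObject(element, Formatting.None));
            sb.Append('\n');
        }
        return sb.ToString();
    }
}
```

Like the loop above, this builds the entire output in memory, so it is best suited to small collections.
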
jlavallet
  • It certainly would be valid to serialize one line at a time and append, but as I pointed out: this is not functionality I can get from Json.NET out-of-the-box. It's a fair question whether or not Json.NET _should_ support this format explicitly. What would be the input type for NDJson, an array of objects? – Nathan Taylor Jun 27 '17 at 20:04
  • I agree that it's a fair question whether Json.NET can support this out-of-the-box. – jlavallet Jun 27 '17 at 20:20
  • As to what the input type would be – I suppose from what I quickly read about the NDJSON format, that would depend on the context. It would be "a line of data" that should be separately handled from other "lines of data". What is your context? The line of data could be a simple object with a few properties, a complex object with multiple levels of sub objects, or just a string. You would have to tell me what should appear on each line. – jlavallet Jun 27 '17 at 20:26
  • @jlavallet NDJSON allows for _any_ valid JSON to be transmitted in this format. If you wanted to produce this output with a set of mixed objects in .NET, some type boxing/unboxing would be necessary. Anyway, it was meant more as a rhetorical prompt. Maybe the best implementation is simply to build a custom JsonTextWriter for this type of serialization, eschewing any direct support in the library. – Nathan Taylor Jun 27 '17 at 20:40