Json.NET allows raw newlines in a string value during deserialization, which violates the JSON specification. How can I prevent that and make Json.NET strictly enforce JSON rules?

We have some server-side code that uses Newtonsoft Json.NET to parse some JSON. The same JSON fails to parse in JavaScript, and MySQL's JSON_VALID function returns 0. Is there a way to make Newtonsoft stricter about deserialization? For example, the following code runs without error, although it should throw an exception, because JSON cannot contain unescaped newlines inside strings.

string jsonStr = "{ \"bob\":\"line1\nline2\" }";
var obj = Newtonsoft.Json.JsonConvert.DeserializeObject(jsonStr);

If you look at jsonStr in the debugger, specifically using the text visualizer, you see the line break. As expected, when this exact string is passed to an actual JavaScript engine, parsing fails:

 JSON.parse("{ \"bob\":\"line1\nline2\" }")

VM137:1 Uncaught SyntaxError: Unexpected token

Note that the serialization code seems to do the "right" thing, i.e. it escapes the newline (writing it as a backslash followed by n) when creating output.

public class Test
{
    public string Name { get; set;}
}
Test t = new Test();
t.Name = "Bob\nFrank";
string jsonOut = Newtonsoft.Json.JsonConvert.SerializeObject(t);

Am I missing something?

Alexei Levenkov
bpeikes
  • in C#, the two lines you wrote that declares jsonStr and the Deserialization.. they both work fine. Not sure what you mean by parsing fails. – Jawad Jan 05 '21 at 21:54
  • He means parsing fails in Javascript. Because of the slash. – Alex Hall Jan 05 '21 at 21:56
  • maybe you need to escape the "\" in your string (turning it into a double "\"); otherwise it becomes a newline in the JSON source, not the JSON data... also, [that's](https://stackoverflow.com/questions/42068/how-do-i-handle-newlines-in-json) got an interesting discussion about that in the replies. – Michael Schönbauer Jan 05 '21 at 22:00
  • Can confirm that Json.NET is more forgiving than JavaScript's JSON parser and will accept newlines inside of strings during deserialization. – xanatos Jan 05 '21 at 22:01
  • And jsonOut is byte identical to jsonStr? – Caius Jard Jan 05 '21 at 22:02
  • I've updated title and added JavaScript sample - feel free to revert the change if it does not reflect your question (also I'd recommend to tone down the original title in case you revert/modify my edit). – Alexei Levenkov Jan 05 '21 at 22:16
  • As stated in the [answer below](https://stackoverflow.com/a/65587578/3744182) by xanatos, Json.NET does not offer "strict" parsing. Related but not exactly duplicate: [Validate if string is valid json (fastest way possible) in .NET Core 3.0](https://stackoverflow.com/q/58629279/3744182), [Disable Support for Reading (Invalid JSON) Single Quote Strings](https://stackoverflow.com/q/48236247/3744182), [json net leading zeros (disable base-cast)](https://stackoverflow.com/q/37561583/3744182), [How to enforce quotes on property names on JSON .NET](https://stackoverflow.com/q/53304218/3744182). – dbc Jan 05 '21 at 23:15

1 Answer

I've debugged Newtonsoft Json.NET, and I'd say you can't. Everything interesting happens in the JsonTextReader class, and there is no useful override point. The path you are interested in is Read()->ParseValue()->ParseString()->ReadStringIntoBuffer(), where, toward the end, you find:

case StringUtils.CarriageReturn:
    _charPos = charPos - 1;
    ProcessCarriageReturn(true);
    charPos = _charPos;
    break;
case StringUtils.LineFeed:
    _charPos = charPos - 1;
    ProcessLineFeed();
    charPos = _charPos;
    break;

that will "accept" newlines inside strings.

Worse, there is not even an overridable method or event about "begin and end of string parsing".

You could of course rewrite the whole JsonTextReader, but you can't simply copy and paste it into a new file, because it uses internal classes of Json.NET (StringBuffer, StringReference, StringUtils, CollectionUtils, ConvertUtils, and MiscellaneousUtils, the last only for .Assert) plus internal methods of JsonReader.

Remember that if all you want is to check whether the JSON is valid, you can try the other parsers that exist. Microsoft gives you two: the (old) JavaScriptSerializer and the (new) JsonSerializer from System.Text.Json.
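As a rough sketch of that idea, the text can be run through System.Text.Json's Utf8JsonReader before it is handed to Json.NET. With its default options that reader rejects unescaped control characters inside strings, as well as comments and trailing commas. The StrictJson class and IsStrictlyValid method below are hypothetical names for illustration; they are not part of either library.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical helper, not part of Json.NET or System.Text.Json.
public static class StrictJson
{
    // Returns true when the text is accepted by Utf8JsonReader with its
    // default options; returns false when the reader throws JsonException.
    public static bool IsStrictlyValid(string json)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(json);
        var reader = new Utf8JsonReader(bytes);
        try
        {
            // Walk every token; invalid input throws before Read() returns false.
            while (reader.Read()) { }
            return true;
        }
        catch (JsonException)
        {
            return false;
        }
    }
}
```

Assuming this helper, StrictJson.IsStrictlyValid("{ \"bob\":\"line1\nline2\" }") should return false for the string from the question, while the properly escaped "{ \"bob\":\"line1\\nline2\" }" should return true. The check costs an extra pass over the input, so it suits cases where rejecting invalid JSON matters more than parsing speed.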

xanatos
  • You are correct, this is confirmed by Newtonsoft in [Support "strict mode" for RFC7159 parsing #646](https://github.com/JamesNK/Newtonsoft.Json/issues/646#issuecomment-356828682): *I'm not sure. I get the feeling that if I add a strict mode as just a bool flag on JsonTextReader then people will start asking for more options...*. – dbc Jan 05 '21 at 23:10
  • BTW [`JavaScriptSerializer`](https://learn.microsoft.com/en-us/dotnet/api/system.web.script.serialization.javascriptserializer) hasn't even been ported to .NET Core. `JsonSerializer`, `DataContractJsonSerializer`, [`JsonReaderWriterFactory.CreateJsonReader()`](https://learn.microsoft.com/en-us/dotnet/api/system.runtime.serialization.json.jsonreaderwriterfactory.createjsonreader) and [`Utf8JsonReader`](https://learn.microsoft.com/en-us/dotnet/api/system.text.json.utf8jsonreader) are the options there. – dbc Jan 05 '21 at 23:18
  • [This answer](https://stackoverflow.com/a/59983507/3744182) by [lidgren](https://stackoverflow.com/a/59983507/3744182) using `Utf8JsonReader` looks to be the simplest way to check JSON validity in .NET Core. – dbc Jan 05 '21 at 23:18
  • That's what I was afraid of. Thanks for the details. – bpeikes Jan 06 '21 at 02:27