It is not the serializer that is causing this issue; Json.Net handles foreign characters just fine. More likely you are doing one of the following:
- Using an inappropriate encoding (or not setting the encoding) when writing the JSON to a file or stream. You should probably be using `Encoding.UTF8`.
- Storing the JSON into a `varchar` column in your database rather than an `nvarchar` column. `varchar` does not support Unicode characters. (See the sketch after this list for storing JSON in an `nvarchar` column.)
- Viewing the JSON with a viewer that does not support Unicode, uses the wrong encoding, and/or uses a font that does not have the full set of Unicode character glyphs. The Windows command prompt window seems to have this issue, for example.
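To illustrate the database point, here is a minimal sketch of inserting the JSON through a parameterized command. The `Documents` table, its `Json nvarchar(max)` column, and the connection string are assumptions for illustration; the key detail is declaring the parameter as `SqlDbType.NVarChar` so the Unicode text is stored intact:

```csharp
using System.Data;
using Microsoft.Data.SqlClient; // System.Data.SqlClient on .NET Framework

static class JsonRepository
{
    // Hypothetical table: CREATE TABLE Documents (Json nvarchar(max))
    public static void SaveJson(string connectionString, string json)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "INSERT INTO Documents (Json) VALUES (@json)", connection))
        {
            // NVarChar preserves Unicode; a VarChar parameter (or column)
            // would silently replace unsupported characters with '?'.
            command.Parameters.Add("@json", SqlDbType.NVarChar).Value = json;
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```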
To prove that the serializer is not the problem, try compiling and running the following example program. It will create two different output files from the same JSON, one using UTF-8 encoding and the other using the default encoding. Open each file using Notepad. The "default" file will show the foreign characters as `?` characters. In the UTF-8 encoded file, you should see that all the characters are intact. (If you still don't see them, try changing the Notepad font to "Arial Unicode MS".)
You can also see that the foreign characters are correct in the JSON using the Visual Studio debugger; just put a breakpoint after the line where it serializes the JSON and examine the `json` variable.
```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;   // required for Encoding
using Newtonsoft.Json;

class Program
{
    static void Main(string[] args)
    {
        // Sample data containing non-ASCII characters in several scripts.
        List<Foo> foos = new List<Foo>
        {
            new Foo { Language = "Hebrew",   Sample = "אספירין" },
            new Foo { Language = "Hindi",    Sample = "एस्पिरि" },
            new Foo { Language = "Chinese",  Sample = "阿司匹林" },
            new Foo { Language = "Japanese", Sample = "アセチルサリチル酸" },
        };

        var json = JsonConvert.SerializeObject(foos, Formatting.Indented);

        // Same JSON, two encodings: only the UTF-8 file keeps the
        // foreign characters intact.
        File.WriteAllText("utf8.json", json, Encoding.UTF8);
        File.WriteAllText("default.json", json, Encoding.Default);
    }
}

class Foo
{
    public string Language { get; set; }
    public string Sample { get; set; }
}
```
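As a further check, you can read the UTF-8 file back and deserialize it; the round trip should preserve every character. Here is a minimal sketch that reuses the `Foo` class above (the `RoundTripCheck` name and the `Console.OutputEncoding` step are additions for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using Newtonsoft.Json;

static class RoundTripCheck
{
    public static void Verify()
    {
        // Read back with the same encoding the file was written with.
        string json = File.ReadAllText("utf8.json", Encoding.UTF8);
        var foos = JsonConvert.DeserializeObject<List<Foo>>(json);

        // Without this, many consoles print '?' even when the data is correct.
        Console.OutputEncoding = Encoding.UTF8;
        foreach (var foo in foos)
            Console.WriteLine("{0}: {1}", foo.Language, foo.Sample);
    }
}
```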