I have a simple UTF-8 test page in which text containing letters from many different languages gets stringified to JSON:
HTML:
<textarea id="txt">
検索 • Busca • Sök • 搜尋 • Tìm kiếm • Пошук • Cerca • Søk • Haku • Hledání • Keresés • 찾기 • Cari • Ara • جستجو • Căutare • بحث • Hľadať • Søg • Serĉu • Претрага • Paieška • Poišči • Cari • חיפוש • Търсене • Іздеу • Bilatu • Suk • Bilnga • Traži • खोजें
</textarea>
<button id="encode">Encode</button>
<pre id="out">
</pre>
JavaScript:
$("#encode").click(function () {
  $("#out").text(JSON.stringify({ txt: $("#txt").val() }));
}).click();
While I expect the non-ASCII characters to be escaped as \uXXXX as per the JSON spec, they seem to be untouched. Here's the output I get from the above test:
{"txt":"検索 • Busca • Sök • 搜尋 • Tìm kiếm • Пошук • Cerca • Søk • Haku • Hledání • Keresés • 찾기 • Cari • Ara • جستجو • Căutare • بحث • Hľadať • Søg • Serĉu • Претрага • Paieška • Poišči • Cari • חיפוש • Търсене • Іздеу • Bilatu • Suk • Bilnga • Traži • खोजें\n"}
I'm using Chrome, so it should be the native JSON.stringify implementation. The page's encoding is UTF-8. Shouldn't the non-ASCII characters be escaped?
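To make the expectation concrete, here is a minimal sketch of the output I had in mind. It assumes a hypothetical helper, stringifyAscii (my own name, not part of JSON, Chrome, or jQuery), that post-processes the result of JSON.stringify and replaces every UTF-16 code unit above U+007F with a \uXXXX escape:

```javascript
// Hypothetical helper for illustration only; not a built-in or jQuery API.
// It serializes with the native JSON.stringify, then rewrites each
// non-ASCII UTF-16 code unit as a \uXXXX escape sequence.
function stringifyAscii(value) {
  return JSON.stringify(value).replace(/[\u0080-\uffff]/g, function (ch) {
    // Zero-pad the hex code unit to four digits.
    return "\\u" + ("0000" + ch.charCodeAt(0).toString(16)).slice(-4);
  });
}
```

With this, stringifyAscii({ txt: "Sök" }) gives {"txt":"S\u00f6k"}, and JSON.parse turns that back into the original text.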
What brought me to this test in the first place is that I noticed jQuery.ajax doesn't seem to escape non-ASCII characters when they appear in a data object property. The characters seem to be transmitted as UTF-8.