I am using JSON to pass data between two systems. One of the JSON properties contains rich text. Most of the time there are no problems, but once in a blue moon special characters such as curly quotes, which are not UTF-8 characters, make it into the rich text.

I want to replace these special characters with their UTF-8 equivalents. How can I achieve this in C#?

Example of such a string: “Cops bring lettuce & tomato, dispose of evidence,”. If I type a regular quote, it looks like this: "

Thanks

Gabbar
  • What curly quotes are not UTF-8? Post an example please. – Oded Sep 05 '12 at 20:15
  • How do you form your json string? – L.B Sep 05 '12 at 20:17
  • Those characters are no more special than `a`, `b` or `c`. Neither JSON nor C# should have any problem with them. How do you obtain them, and how do you transmit them? You must be doing something wrong somewhere along the way. Also, what does `“` end up looking like? That can be a clue. – Jon Hanna Sep 05 '12 at 20:28
  • In Textpad, “ look like black boxes. – Gabbar Sep 05 '12 at 20:36
  • Textpad? Does your C# dump to file? That introduces the possibility that the JSON parsing is perfect, and the bug is after it. – Jon Hanna Sep 05 '12 at 20:39
  • When I paste the JSON in Textpad, the “ look like black boxes. I am reading the code from a rich text field in a CMS. – Gabbar Sep 05 '12 at 20:40
  • Wait, but it looks like a proper `“` in the CMS itself? What's the actual bug? (Rather than it not working in Textpad, that's a bug in Textpad that Textpad don't consider a bug and aren't fixing). – Jon Hanna Sep 05 '12 at 20:46

1 Answer

The quotes you posted are sometimes called "smart quotes" - “”. They are valid Unicode characters and can be encoded as UTF-8, but they are not the quotes that JSON (and most programming languages) use to delimit strings.

They are the kind of quotes produced from pasting code into Word.

The fix is to replace both characters with quotes that are valid for JSON (that is, ").
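As a minimal sketch of that replacement in C#, assuming you really do want to swap the typographic quotes for plain ASCII ones: the class and method names below (`QuoteNormalizer`, `ToStraightQuotes`) are made up for illustration, and the set of characters handled is only the four common curly quotes.

```csharp
using System;

public static class QuoteNormalizer
{
    // Hypothetical helper: maps the four common "smart" quotes to ASCII quotes.
    // Any other typographic characters (dashes, ellipses, ...) are left untouched.
    public static string ToStraightQuotes(string input)
    {
        if (input == null) return null;

        return input
            .Replace('\u201C', '"')    // left double quotation mark
            .Replace('\u201D', '"')    // right double quotation mark
            .Replace('\u2018', '\'')   // left single quotation mark
            .Replace('\u2019', '\'');  // right single quotation mark
    }
}
```

Note that this changes the content itself, so it is only worth doing if the receiving system genuinely cannot handle the original characters.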

If straight quotes appear inside the JSON string values, you need to escape them with a \ - so instead of " you will use \".
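In practice it is usually safer to let a JSON serializer do the escaping than to build the string by hand. The sketch below assumes `System.Text.Json` is available (the thread does not say which serializer is in use), and the property name `body` is hypothetical. With its default settings this serializer also writes non-ASCII characters such as curly quotes as `\uXXXX` escapes, so the resulting JSON text is plain ASCII.

```csharp
using System;
using System.Text.Json;

public static class JsonEscapingDemo
{
    // Wraps the rich text in an object with a single (hypothetical) "body" property.
    // The serializer takes care of escaping quotes and other special characters.
    public static string ToJson(string richText)
    {
        return JsonSerializer.Serialize(new { body = richText });
    }

    // Reads the "body" property back; escapes are decoded to the original characters.
    public static string FromJson(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement.GetProperty("body").GetString();
        }
    }
}
```

A round trip such as `JsonEscapingDemo.FromJson(JsonEscapingDemo.ToJson(text))` should give back the original text, curly quotes included, without any manual replacement.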

Also, take a look at this question and its answers - make sure that the server returns the JSON response as UTF-8 and not some other encoding.
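One way to check the encoding is to look at the raw bytes. The sketch below is only an illustration under the assumption that a windows-1252 versus UTF-8 mismatch is involved, which has not been confirmed here; `EncodingCheck` and its methods are hypothetical names. In UTF-8 the left curly quote is three bytes (E2 80 9C), while in windows-1252 it is the single byte 0x93, which is not valid UTF-8 on its own.

```csharp
using System;
using System.Text;

public static class EncodingCheck
{
    // Returns the UTF-8 bytes of a string as hex, e.g. "E2-80-9C".
    public static string Utf8Hex(string text)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text));
    }

    // Decodes raw bytes as UTF-8. Invalid sequences come back as U+FFFD,
    // the replacement character, which many editors draw as a box or "?".
    public static string DecodeAsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }
}
```

For example, `EncodingCheck.Utf8Hex("\u201C")` gives `E2-80-9C`, and `EncodingCheck.DecodeAsUtf8(new byte[] { 0x93 })` gives a single U+FFFD. If you see 0x93 or 0x94 in what is actually being sent, the data is probably not UTF-8.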

Oded
  • Oh. Since they're valid for JSON, I assumed the “” was part of the content rather than being passed through as code. If you're on the right track with this one, I smell bigger problems. – Jon Hanna Sep 05 '12 at 20:32
  • @JonHanna - You may very well be right about it being content. I just saw [this question](http://stackoverflow.com/questions/3161487/using-single-smart-quote-in-my-json-data-is-breaking-php-script) - the accepted answer suggests that the content is passed through as `windows-1252` instead of `UTF-8`! – Oded Sep 05 '12 at 20:34
  • The quotes are part of the content, not the ones that delimit the property. – Gabbar Sep 05 '12 at 20:35
  • @Gabbar - OK, but something seems to translate smart quotes as dumb quotes. You need to check that the encoding of the JSON is correct (what is actually being sent, what is actually being received). – Oded Sep 05 '12 at 20:37