I use an external API to migrate some data to my application. All the data is in JSON format, and it contains links to images.

The problem is that some links have HTML encoding sequences instead of the real symbols. E.g.:

  1. What I got:

    externalApiPath/live/userfiles/Image/img_CS FSTD(H).300 AMC1 cont`d_1.PNG

  2. What I want to get:

    externalApiPath/live/userfiles/Image/img_CS FSTD(H).300 AMC1 cont`d_1.PNG

And I can't get the image by placing externalApiPath/live/userfiles/Image/img_CS FSTD(H).300 AMC1 cont`d_1.PNG link in the browser. But I can do it with externalApiPath/live/userfiles/Image/img_CS FSTD(H).300 AMC1 cont`d_1.PNG.

So they replaced the ` symbol with the ` code. What do I need to do to convert all ISO 8859-1 codes in the whole text to the corresponding symbols?

  • You may be confused with ISO 8859-1 (aka Latin 1), which is an encoding. Even so, it's actually rarely used, with Windows-1252 (which is much like it, but not identical) being more common. – Jeroen Mostert Nov 03 '21 at 12:39
  • ISO8601 is a way of formatting dates: `YYYY-MM-DDTHH:mm:ss`. What you posted has nothing to do with dates. – Panagiotis Kanavos Nov 03 '21 at 12:41
  • How was this string generated? What is the *actual* JSON string? All JSON serializers will encode special characters in strings *and* decode them upon deserialization. `&` isn't such a character, so the code that generated this string has a bug. There was no reason to emit `\&` instead of `&`. All serializers will choke on this. – Panagiotis Kanavos Nov 03 '21 at 12:42
  • @T.J.Crowder, sorry, I meant ISO 8859-1 encoding, not datetime format. Updated the question. – Denis Kaminsky Nov 03 '21 at 12:55
  • @T.J.Crowder, no, there is no "\" in the API response. I added it in the StackOverflow editor because it replaced "`" with "`" by default. Now everything is okay. Updated the question. – Denis Kaminsky Nov 03 '21 at 13:05
  • In that case, it's a duplicate of [this question](https://stackoverflow.com/questions/122641/how-can-i-decode-html-characters-in-c). This is basic [HTML character reference](https://html.spec.whatwg.org/multipage/syntax.html#character-references) decoding. *(Sorry for the error in my first edit, by the way.)* – T.J. Crowder Nov 03 '21 at 13:07
  • @DenisKaminsky that's not ISO 8859-1, i.e. plain old Latin 1, either, which has no codes or escape sequences. ``` is HTML encoding. – Panagiotis Kanavos Nov 03 '21 at 13:11
  • @T.J.Crowder, you are right. Using HttpUtility.HtmlDecode I solved my problem. Thanks. – Denis Kaminsky Nov 03 '21 at 13:11
  • @DenisKaminsky why was the problem created in the first place? Why was the data HTML-encoded before serializing to JSON? No JSON serializer would do this; it was done by code *before* generating the JSON string. – Panagiotis Kanavos Nov 03 '21 at 13:12
  • @PanagiotisKanavos, because, as I said, I use an external API to migrate some data to my application. I don't even have access to its source code and I don't know what's going on there. – Denis Kaminsky Nov 03 '21 at 13:14
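
For reference, the HTML character reference decoding discussed in the comments above could look roughly like the sketch below. It uses `System.Net.WebUtility.HtmlDecode`, which decodes the same kind of references as the `HttpUtility.HtmlDecode` call mentioned in the thread but does not need a reference to System.Web. The class name `ImageLinkDecoder`, the helper `DecodeLink`, and the sample input are hypothetical; in particular, the sample assumes the backtick arrived as the numeric reference `&#96;`, which is only a guess at what the API actually returned.

```csharp
using System;
using System.Net;

public static class ImageLinkDecoder
{
    // Hypothetical helper: turns HTML character references in a link
    // (for example "&#96;" or "&amp;") back into the characters they stand for.
    public static string DecodeLink(string rawLink)
    {
        return WebUtility.HtmlDecode(rawLink);
    }

    public static void Main()
    {
        // Assumed sample value; the real API response may use a different reference.
        string raw = "externalApiPath/live/userfiles/Image/img_CS FSTD(H).300 AMC1 cont&#96;d_1.PNG";

        // Prints the link with a literal backtick in place of "&#96;".
        Console.WriteLine(DecodeLink(raw));
    }
}
```

Since the references can appear anywhere in the JSON string values, it is probably safer to decode each string value after deserialization than to decode the raw JSON text, because a decoded `&quot;` would otherwise turn into an unescaped quote inside the JSON.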

0 Answers