1

I make a service call and I get this string "lastName":"Düsedau"

As you can see, this looks quite strange, but if you decode it as UTF-8, it comes out correct (checked with https://encoder.mattiasgeniar.be/index.php).

The problem is that these weird characters appear in the UI even though I have set the charset to utf-8:

<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
<meta name="description" content="'moduleApp'" />
<meta name="viewport" content="width=device-width" />

My Service:

promises.People.$promise.then(function(data) {
    this.people = data.People; // JSON that has lastName
});

How can I remove these weird characters from the JSON? I use AngularJS.

user11341081
  • 1
    The server is not sending UTF8 – mplungjan Jul 10 '19 at 11:39
  • And if you control the source...fix it there. – charlietfl Jul 10 '19 at 11:40
  • @mplungjan let's assume it is impossible to change that. Can I fix on the UI? – user11341081 Jul 10 '19 at 11:40
  • Your server says it sends UTF-8 (_Content-Type: text/html; charset=UTF-8_). Encode the file using UTF-8 when saving it. – Teemu Jul 10 '19 at 11:41
  • 1
    @user11341081 sure, you can parse the string and decode the UTF-8 multibyte sequences into the proper characters ... but seriously? Fix the source. Whether the encoding of the file is wrong or the server doesn't include the charset in the `Content-Type` header, fix that! Don't mess around with broken data. – Thomas Jul 10 '19 at 11:45

3 Answers

2

You can do this if you absolutely cannot fix the server:

console.log(decodeURIComponent(escape(`"lastName":"Düsedau"`)))

Alternatively, have a proxy read the Latin-1 response and re-encode it as UTF-8 before sending it to your client.
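
The one-liner above relies on `escape`, which is deprecated. As a rough sketch of the same idea without it, the helper below (the name `fixMojibake` is hypothetical, not part of AngularJS or any library) treats each character as a single byte and re-decodes those bytes as UTF-8 with `TextDecoder`. It assumes the text was mis-decoded as ISO-8859-1, so every character fits in one byte; anything else is returned unchanged.

```javascript
// Hypothetical helper: repairs UTF-8 text that was mis-decoded as ISO-8859-1.
function fixMojibake(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    // A character above 0xFF cannot be a single Latin-1 byte: leave the text alone.
    if (code > 0xff) return text;
    bytes[i] = code;
  }
  try {
    // fatal: true makes decode() throw on byte sequences that are not valid UTF-8.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch (e) {
    // Not valid UTF-8, so the text was probably fine to begin with.
    return text;
  }
}

console.log(fixMojibake('Düsedau')); // Düsedau
```

In the service from the question this could be applied to each affected field before assigning the data to the scope. It is a workaround only; it does not change what the server sends.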

mplungjan
-2

ü (as in 0xC3 0xBC) is mojibake for ü, aka U+00FC 'LATIN SMALL LETTER U WITH DIAERESIS'. You obtain it when you encode ü as UTF-8 and then parse/render the resulting bytes as some single-byte encoding such as ISO-8859-1 or Windows-1252, where 0xC3 stands for U+00C3 'LATIN CAPITAL LETTER A WITH TILDE' and 0xBC stands for U+00BC 'VULGAR FRACTION ONE QUARTER'. This suggests that the data was originally stored as UTF-8 and was misinterpreted somewhere upstream.
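
The mechanism can be reproduced in a few lines. This is only an illustration; the function name `misreadUtf8AsLatin1` is made up for the example. It encodes a string as UTF-8 and then maps each byte to the character with the same code point, which is what ISO-8859-1 decoding does.

```javascript
// Hypothetical demo: encode as UTF-8, then misread every byte as one ISO-8859-1 character.
function misreadUtf8AsLatin1(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  const hex = Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, '0'));
  // In ISO-8859-1, byte value N is the character U+00NN.
  const misread = Array.from(bytes, (b) => String.fromCharCode(b)).join('');
  return { hex: hex.join(' '), misread: misread };
}

const result = misreadUtf8AsLatin1('ü');
console.log(result.hex);     // C3 BC
console.log(result.misread); // ü
```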

The very first check you could make would be to open your browser's developer tools and inspect the server's response, taking into account that valid JSON can only be encoded as UTF-8. Move on from there.
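
If the developer tools show that the bytes on the wire are valid UTF-8 but the response is labelled with a different charset, one possible client-side workaround is to request the raw bytes and decode them yourself. The sketch below assumes exactly that situation; `parseUtf8Json` is a hypothetical name, and the AngularJS call in the comment uses a placeholder URL.

```javascript
// Hypothetical helper: decode raw response bytes as UTF-8, ignoring the declared charset.
function parseUtf8Json(buffer) {
  // decode() accepts an ArrayBuffer or a typed array such as Uint8Array.
  const text = new TextDecoder('utf-8').decode(buffer);
  return JSON.parse(text);
}

// Possible use with AngularJS $http (URL is a placeholder):
// $http.get('/api/people', { responseType: 'arraybuffer' })
//   .then(function (response) { return parseUtf8Json(response.data); });
```

This leaves the mislabelled response in place, so fixing the header on the server remains the cleaner option.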

Álvaro González
-2

Maybe you should try this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
boekenenbroeken