I have to read a badly encoded string from a remote service and cannot figure out how to recover the correct value in C# or JavaScript. I can neither change the values in the service nor change the way they are saved in the DB, but I need to display them correctly.
Bad string: Adrián José
Correct string: Adrián José
The error is reversible: the fixed value can be obtained with tools such as https://www.iosart.com/tools/charset-fixer or in Notepad++ by changing the encoding from ANSI to UTF-8.
So far, I have this solution in JS (client side), but I don't like using the escape() function and would like to do the fix on the server side.
var badString = "Adrián José";
var fixedString = decodeURIComponent(escape(badString)); // "Adrián José"
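For comparison, here is a minimal sketch of the same reinterpretation without escape(). It assumes every character in the bad string has a code point below 256, that is, the UTF-8 bytes were mis-decoded as ISO-8859-1 (Latin-1); text mis-decoded as Windows-1252 with characters such as € would need a lookup table instead. The name fixMojibake is hypothetical, and TextDecoder is assumed to be available (modern browsers and recent Node.js).

```javascript
// Hypothetical helper: treat each character as a single byte,
// then decode the resulting byte sequence as UTF-8.
function fixMojibake(badString) {
  const bytes = new Uint8Array(badString.length);
  for (let i = 0; i < badString.length; i++) {
    const code = badString.charCodeAt(i);
    // Assumption: the input only contains Latin-1 characters (U+0000..U+00FF).
    if (code > 0xff) {
      throw new RangeError("Character outside Latin-1 at index " + i);
    }
    bytes[i] = code;
  }
  // fatal: true makes invalid UTF-8 throw instead of silently yielding U+FFFD.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}
```

With the sample above, fixMojibake("Adrián José") should give "Adrián José", since "Ã¡" is the byte pair C3 A1 and "Ã©" is C3 A9 when read as single-byte characters.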
I tried to play with the Encoding class in C# (like here), but couldn't find a valid combination.
var badString = "Adrián José";
var origEnco = Encoding.UTF8;
var targetEnco = Encoding.Default;
byte[] utfBytes = origEnco.GetBytes(badString);
byte[] isoBytes = Encoding.Convert(origEnco, targetEnco, utfBytes);
string fixedString = targetEnco.GetString(isoBytes); // "Adrián José"
What am I missing? How do the character set fixer or Notepad++ work?