Can't fully correct encoding issue from website

Asked Apr 10 '15 at 00:24

Active Apr 10 '15 at 00:51

I'm developing a web scraper for a soccer site. Player names from various countries contain non-ASCII characters, and those characters come through garbled when I pull them in. I've worked out a method that corrects a few of them, but it doesn't catch Turkish characters or anything else I haven't listed explicitly. Here is what I have so far:

private String formatMe(String sF)
    {
      String myString = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(sF))
         .Replace("Ã©", "é")
         .Replace("Ã¡", "á")
         .Replace("Ã", "í")
         .Replace("Ã³", "ó");

      return myString;

    }//END FORMAT

Here's an example of a site I would pull from.

Is there any way I can just fix the encoding from the site in one fell swoop?

edited Apr 10 '15 at 00:51

stuartd


asked Apr 10 '15 at 00:24

jDave1984

What is not working? – rory.ap Apr 10 '15 at 00:25
Characters are coming in all wonky. For instance M. Özil is coming in as M. Ã–zil. – jDave1984 Apr 10 '15 at 00:26
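
The "Ö" to "Ã–" pattern above is what UTF-8 bytes look like when they are decoded as Windows-1252: "Ö" is the two bytes C3 96 in UTF-8, and those bytes read as "Ã" and "–" in Windows-1252. Assuming that is what is happening here (the thread does not confirm how the page is downloaded), a minimal sketch of reversing it for every character at once, instead of one Replace per character, might look like the following. MojibakeFix and Repair are hypothetical names, not part of the original code.

```csharp
using System;
using System.Text;

static class MojibakeFix
{
    // Hypothetical helper: assumes the input was UTF-8 text that got
    // decoded as Windows-1252 somewhere along the way.
    public static string Repair(string garbled)
    {
        // On .NET Core / .NET 5+, code page 1252 is only available after
        // registering the provider from System.Text.Encoding.CodePages:
        // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding windows1252 = Encoding.GetEncoding(1252);

        // Turn the wrongly decoded characters back into the original bytes...
        byte[] originalBytes = windows1252.GetBytes(garbled);

        // ...and decode those bytes as what they really are: UTF-8.
        return Encoding.UTF8.GetString(originalBytes);
    }

    static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(Repair("M. Ã–zil")); // prints "M. Özil"
    }
}
```

This only repairs strings that are already garbled; running it on text that was decoded correctly would damage it. If the download code is under your control, decoding the response as UTF-8 in the first place (for example by setting WebClient.Encoding to Encoding.UTF8 before calling DownloadString, if WebClient is what is in use) would likely avoid the problem altogether.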

0 Answers