
I need to convert my HTML file from charset ISO-8859-1 to UTF-8. Could you help me?

This is my code:

    string converHtml = File.ReadAllText(html);

    Encoding iso = Encoding.GetEncoding("windows-1252");
    Encoding utf8 = Encoding.UTF8;
    byte[] isoBytes = iso.GetBytes(converHtml);
    byte[] utf8Bytes = Encoding.Convert(utf8, iso, isoBytes);
    string msg = utf8.GetString(utf8Bytes);

    msg = HttpUtility.HtmlDecode(msg);

    return msg;
jonathan
  • What's the problem with what you have? – see sharper Jul 01 '20 at 01:35
  • Why don't you provide the Encoding already when reading the file? See https://learn.microsoft.com/en-us/dotnet/api/system.io.file.readalltext?view=netcore-3.1#System_IO_File_ReadAllText_System_String_System_Text_Encoding_ – Klaus Gütter Jul 01 '20 at 01:55
  • Most likely should be duplicate of https://stackoverflow.com/questions/12130290/how-to-read-text-files-with-ansi-encoding-and-non-english-letters , but code is very confusing so maybe there is something else... – Alexei Levenkov Jul 01 '20 at 02:06

1 Answer


Thanks Klaus Gütter and Alexei Levenkov, it worked for me. This is my code:

    // Code page 28591 is ISO-8859-1 (Latin-1); decode the file with it explicitly.
    string ags;
    using (StreamReader sr = new StreamReader(html, Encoding.GetEncoding(28591)))
    {
        ags = sr.ReadToEnd();
    }
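
Reading the file with the right encoding only fixes the string in memory (.NET strings are UTF-16 internally). If the goal is a UTF-8 file on disk, the text also has to be written back out as UTF-8. Here is a minimal sketch of that round trip. The class name, method name, and parameter names are hypothetical, made up for illustration, and it assumes the input really is ISO-8859-1:

```csharp
using System.IO;
using System.Text;

static class HtmlEncodingConverter
{
    // Hypothetical helper: re-encodes a text file from ISO-8859-1 to UTF-8.
    public static void ConvertLatin1ToUtf8(string inputPath, string outputPath)
    {
        // Decode the bytes on disk as ISO-8859-1 (same encoding as code page 28591).
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string text = File.ReadAllText(inputPath, latin1);

        // Encode the same text as UTF-8; 'false' means no byte order mark is written.
        File.WriteAllText(outputPath, text, new UTF8Encoding(false));
    }
}
```

A call would look like `HtmlEncodingConverter.ConvertLatin1ToUtf8("page.html", "page-utf8.html")`, with both file names as placeholders. This does not touch any charset declaration inside the HTML itself, so a `<meta>` tag that still says iso-8859-1 would have to be updated separately.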
jonathan