
I have a file that contains some HTML code. I am trying to load this data into a C# console app and write it out as a JSON file to upload elsewhere. As soon as I load the file, some characters (such as the degree sign) are lost.

Example data

<li>Comfort Range: -60°F to 30°F / -50°C to -1°C</li>

Basic read file

//Load the file
String HTML_File = File.ReadAllText(location);
//Output the file to see the text
Console.WriteLine(HTML_File);

Console Output

<li>Comfort Range: -60??F to 30?F / -50?C to -1?C</li>

After I split the data the way I need to, I then save the class to a JSON file

File.WriteAllText(OutputPath,JsonConvert.SerializeObject(HTMLDATA));

JSON file Data

<li>Comfort Range: -60�F to 30�F / -50�C to -1�C</li>

How can I load this data and convert it to JSON without losing the encoding? I am still pretty new to encoding issues like this.

@JeremyLakeman helped me solve this, thank you sir!! When reading the text into the utility I needed to set the Encoding, and not to one of the default ones.

File.WriteAllText(OutputPath,JsonConvert.SerializeObject(HTMLDATA), Encoding.GetEncoding("iso-8859-1"));
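For anyone landing here, a minimal sketch of applying an explicit encoding on the read side as well. It assumes the source files were saved as single-byte ISO-8859-1 rather than UTF-8, which is a guess about the data and not something the files themselves confirm. `EncodingSketch`, `ReadLatin1` and `WriteUtf8` are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingSketch
{
    // Decode the file as ISO-8859-1 instead of the UTF-8 default.
    // Assumption: the input really is ISO-8859-1; in that encoding the
    // degree sign is the single byte 0xB0.
    public static string ReadLatin1(string path)
    {
        return File.ReadAllText(path, Encoding.GetEncoding("iso-8859-1"));
    }

    // Write the text (for example, a serialized JSON string) as UTF-8
    // without a byte order mark.
    public static void WriteUtf8(string path, string text)
    {
        File.WriteAllText(path, text, new UTF8Encoding(false));
    }
}
```

With this shape, the string returned by `JsonConvert.SerializeObject(HTMLDATA)` could be passed to `WriteUtf8` so that the JSON file ends up in a Unicode encoding, while the legacy encoding is only used to decode the input.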
  • 1) replace `°` with the HTML entity `&deg;`. 2) save your file as UTF-8 / UTF-16. 3) maybe `Encoding.GetEncoding("Windows-1252")` ? – Jeremy Lakeman Feb 01 '22 at 00:06
  • I'm not having a problem printing to the console. I added the console output only so I could see the value of the string I was reading, for troubleshooting. I didn't mention having trouble printing to the console, so I am not sure how you came to that conclusion @AlexeiLevenkov – Samuel Dague Feb 01 '22 at 00:10
  • @JeremyLakeman If it was one or two issues like that I wouldn't mind, but there are over 30,000 files with HTML characters, so I wouldn't want to do that for every single file. Would there be a better way to handle this when I read the file into the console app? – Samuel Dague Feb 01 '22 at 00:12
  • Probably https://stackoverflow.com/a/7178744/4139809 – Jeremy Lakeman Feb 01 '22 at 00:14
  • @AlexeiLevenkov and the solution you posted didn't help at all. Even doing it just in the console window, the output still looks like this: `Comfort Range: -60�F to 30�F / -50�C to -1�C` – Samuel Dague Feb 01 '22 at 00:17
  • @JeremyLakeman Thank you good sir! You shared something useful that actually helped me solve this! God among men sir! Thank you! – Samuel Dague Feb 01 '22 at 00:31
  • That particular question involves 8-bit characters because the user needed ASCII. You don't have that need so you should use UTF-8 or some other Unicode format. Please see the answer marked as duplicate instead. – siride Feb 01 '22 at 00:43
  • BTW, your issue does involve printing to the console. The console is incorrectly displaying the characters in your string. That doesn't mean the string or the file are corrupted. – siride Feb 01 '22 at 00:44
  • @siride Well yes, TECHNICALLY it does involve printing to the console since I added this to troubleshoot the problem. However, I feel that saying "It sound like you have problems with printing to console" is focusing on something that was just for troubleshooting and not going to be in the final solution, so ultimately irrelevant and probably the product of "let me hurry and close the post". Jeremy actually found a good solution and helped me. Jeremy > Alexei – Samuel Dague Feb 01 '22 at 01:32
  • Unless you verify that the characters are wrong through some other means, you must first investigate whether the output mechanism is itself the problem. That's a legitimate issue, which I've had before. In any case, you should be using Unicode and not old Windows code pages. The duplicate question explains that, so this question is closed correctly. – siride Feb 01 '22 at 01:52
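
One way to check the string without relying on how the console renders it is to print the code point of every non-ASCII character. This is only a sketch, and `CodePointDump` and `Describe` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Linq;

public static class CodePointDump
{
    // Returns the characters separated by spaces, with every non-ASCII
    // character replaced by its UTF-16 code unit in U+XXXX form.
    public static string Describe(string text)
    {
        return string.Join(" ", text.Select(c => c < 128 ? c.ToString() : $"U+{(int)c:X4}"));
    }
}
```

If the file was decoded correctly, the degree sign shows up as `U+00B0`. If the decoder could not interpret the bytes, it shows up as `U+FFFD` (the Unicode replacement character), which means the data was already lost when the file was read and the console is not to blame.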