First of all, inspect what you have actually read (is the encoding correct?):
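// requires: using System; using System.IO; using System.Linq; using System.Text;
string path = @"E:\PROJECTS\NETSLET\Console\Console\files\sample.txt";
// Easier than reading through streams
string fileContent = File.ReadAllText(path);
// Dump the text, escaping everything outside printable ASCII
string dump = string.Concat(fileContent
    .Select(c => c < 32 || c > 126
        ? $"\\u{(int)c:x4}" // escape control characters (including DEL) and non-ASCII ones
        : c.ToString()));   // keep printable ASCII intact
Console.Write(dump);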
If you get the following (note the \u2013 characters):
Match List \u2013 I and List \u2013 II and identify the correct code :
then the file is being read correctly and it is the output that is wrong. You should change the font you are using. If the dump does not look like the above, but like this instead (note the ? characters):
Match List ? I and List ? II and identify the correct code :
This means the system could not decode the characters and substituted ? for them (a dump showing \ufffd, the Unicode replacement character, means the same thing). So the problem is in the reading, that is, in the encoding. Try specifying it explicitly:
// Utf-8
string fileContent = File.ReadAllText(path, Encoding.UTF8);
...
// Win-1250
string fileContent = File.ReadAllText(path, Encoding.GetEncoding(1250));
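One caveat, in case the code runs on .NET Core or .NET 5+ rather than .NET Framework: there, legacy code pages such as 1250 are not available until the code-pages provider is registered. Below is a minimal sketch under that assumption. The byte array is made-up sample data standing in for the file contents (0x96 is the en dash in Windows-1250), not the actual file:

```csharp
using System;
using System.Text;

// Make legacy code pages (1250, 1252, ...) available on .NET Core / .NET 5+.
// On .NET Framework these encodings are built in and this call is not needed.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Hypothetical sample bytes: "List " + 0x96 + " I"
byte[] bytes = { 0x4C, 0x69, 0x73, 0x74, 0x20, 0x96, 0x20, 0x49 };

string text = Encoding.GetEncoding(1250).GetString(bytes);

// True: byte 0x96 was decoded as U+2013 (en dash)
Console.WriteLine(text.IndexOf('\u2013') >= 0);
```

With the provider registered, the File.ReadAllText(path, Encoding.GetEncoding(1250)) call above should work the same way on those runtimes.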
Edit: In the worst case, when you can't simply re-save the file in the required encoding and have to guess the original one, you can try automating the process:
string path = "";
var tries = Encoding.GetEncodings()
    .Select(encoding => new {
        encoding = encoding,
        text = File.ReadAllText(path, encoding.GetEncoding()),
    })
    // 0x2013 is the en dash we expect to find in correctly decoded text
    .Select(item => $"{item.encoding.Name,-8} => {item.text} <- {(item.text.Any(c => c == 0x2013) ? "got it!" : "wrong")}");
Console.WriteLine(string.Join(Environment.NewLine, tries));
Possible output:
IBM037 => Match List ? I and List ? II and identify the correct code : <- wrong
IBM437 => Match List ? I and List ? II and identify the correct code : <- wrong
...
windows-1250 => Match List – I and List – II and identify the correct code : <- got it!
...
utf-8 => Match List ? I and List ? II and identify the correct code : <- wrong
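As a possible complement to the brute-force search, UTF-8 can often be ruled in or out directly, because a strict decoder rejects byte sequences that are not valid UTF-8 instead of silently substituting them. The sketch below assumes that approach; IsValidUtf8 is a hypothetical helper name and the byte arrays are made-up sample data, not the file from the question:

```csharp
using System;
using System.Text;

// Hypothetical helper: returns true only if the bytes form valid UTF-8.
static bool IsValidUtf8(byte[] bytes)
{
    // throwOnInvalidBytes: true makes decoding throw instead of inserting U+FFFD
    var strictUtf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false,
                                      throwOnInvalidBytes: true);
    try
    {
        strictUtf8.GetString(bytes);
        return true;
    }
    catch (DecoderFallbackException)
    {
        return false;
    }
}

// "List " + 0x96 + " I": a lone 0x96 byte is not valid UTF-8
byte[] cp1250Bytes = { 0x4C, 0x69, 0x73, 0x74, 0x20, 0x96, 0x20, 0x49 };
// The same text properly encoded as UTF-8
byte[] utf8Bytes = Encoding.UTF8.GetBytes("List \u2013 I");

Console.WriteLine(IsValidUtf8(cp1250Bytes)); // False
Console.WriteLine(IsValidUtf8(utf8Bytes));   // True
```

A False result only says the file is not UTF-8. It does not identify which single-byte code page was used, so the search above is still needed for that.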