
I have seen similar topics but could not find a solution. I have a .txt file whose text is in Bulgarian (Cyrillic), but every attempt to read it has failed. I tried reading it with this code:

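if (File.Exists(fileName))
{
    // Open the file only after confirming it exists; 'using' disposes the reader.
    using (StreamReader reader = new StreamReader(fileName, Encoding.UTF8))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            Console.WriteLine(line);
        }
    }
}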

I also tried every available Encoding value, including GetEncoding(1251), which I read is for Cyrillic. I also saved the .txt file in each of the available encodings (Unicode, UTF-8, BigEndianUnicode, ANSI) and tried each one in combination with the encoding I set in code, but again without success.

Any ideas on how to read the Cyrillic symbols correctly will be appreciated. Here is some sample text: "Ето примерен текст."

Thanks in advance! :)

casperOne
Yoan Petrov
  • You know the content but you also have to know the encoding of those files. Trying them all is a way to find out, viewing in a Hex viewer might be more efficient. – H H Oct 19 '11 at 13:02
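If no hex viewer is at hand, a few lines of C# can do the same check. This is only a rough sketch: the file name `sample.txt` and the helper `DetectBom` are made up for illustration, and a byte order mark (BOM) is a hint, not proof, since many files have none.

```csharp
using System;
using System.IO;

static class EncodingProbe
{
    // Guess the encoding from the byte order mark only; returns null when no BOM is present.
    // Note: a UTF-32 LE file also starts with FF FE, so this is a hint rather than a proof.
    public static string DetectBom(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF) return "UTF-8";
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE) return "UTF-16 LE";
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF) return "UTF-16 BE";
        return null; // no BOM: possibly ANSI (e.g. windows-1251) or UTF-8 saved without a BOM
    }

    static void Main(string[] args)
    {
        // Hypothetical default path; pass the real file as the first argument.
        string fileName = args.Length > 0 ? args[0] : "sample.txt";
        byte[] bytes = File.ReadAllBytes(fileName);

        // Copy at most the first 16 bytes and print them as hex pairs.
        int count = Math.Min(bytes.Length, 16);
        byte[] head = new byte[count];
        Array.Copy(bytes, head, count);

        Console.WriteLine(BitConverter.ToString(head));
        Console.WriteLine(DetectBom(head) ?? "no BOM found");
    }
}
```

For the sample text, a file with no BOM that starts with D0-95-D1-82 is UTF-8, while one that starts with C5-F2-EE is windows-1251.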

2 Answers


Your problem is that the console can't show cyrillic characters. Try putting a breakpoint on the Console.WriteLine and inspect the line variable. Clearly you'll need to know the correct encoding first! :-)

If you don't trust me, try this: make a console program that does this:

string line = "Ето примерен текст"; 
Console.WriteLine(line);
return 0;

Put a breakpoint on the return 0; line, then compare what the console shows with the value of the line variable in the debugger.

I'll add that unicode consoles should be one of the "new" things in .NET 4.5

And you can try to read this page: c# unicode string output

Community
xanatos
  • The console is not the point here. I am using it in an ASP.NET application with the proper encoding. The problem is that when I get the string in the code-behind, read from a text file with the proper encoding, it shows as "?????" in debug mode... – Yoan Petrov Oct 26 '11 at 10:52
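One way a string can end up as literal question marks is when the text passes through an encoding that cannot represent Cyrillic at all. The sketch below only illustrates that mechanism. Whether it is what happens in the ASP.NET application above is an assumption, and `RoundTrip` is a hypothetical helper name.

```csharp
using System;
using System.Text;

static class QuestionMarkDemo
{
    // Encode the text to bytes and decode it back using the same encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        return encoding.GetString(encoding.GetBytes(text));
    }

    static void Main()
    {
        string sample = "Ето";

        // ASCII has no Cyrillic letters, so the encoder substitutes '?' for each one.
        string lossy = RoundTrip(sample, Encoding.ASCII);

        // UTF-8 can represent every Unicode character, so nothing is lost.
        string intact = RoundTrip(sample, Encoding.UTF8);

        Console.WriteLine(lossy == "???");
        Console.WriteLine(intact == sample);
    }
}
```

If the string is already "?????" in the debugger, the characters were lost before any display step, so it is worth checking every place where the text is converted to or from bytes.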

The problem you are having is not reading the text, but displaying it.

If your real intention is to display Unicode text in a console window, then you'll have to make a few changes. If however, you will be displaying the text in a WinForms or WPF app for instance, then you will not have problems - they work with Unicode by default.

By default, the console will not handle unicode, or use a font which has unicode glyphs. You need to do the following:

  1. Save your text file as UTF8.
  2. Start a console which is unicode enabled: cmd /u
  3. Change the font to "Lucida Sans Unicode": console window menu -> properties -> font
  4. Change the codepage to Unicode: chcp 65001
  5. Run your app.

Your characters will now be displayed correctly.
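The code page can also be set from inside the program instead of running chcp first. The following is a minimal sketch, not a replacement for the steps above: the console font still has to contain Cyrillic glyphs, and `EnableUtf8Output` is a hypothetical helper name.

```csharp
using System;
using System.Text;

static class Utf8Console
{
    // Switch console output to UTF-8 and return the code page now in effect.
    public static int EnableUtf8Output()
    {
        Console.OutputEncoding = Encoding.UTF8;
        return Console.OutputEncoding.CodePage;
    }

    static void Main()
    {
        int codePage = EnableUtf8Output();
        Console.WriteLine("Output code page: " + codePage);
        Console.WriteLine("Ето примерен текст.");
    }
}
```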

Tim Lloyd