I have a UTF-8 file called Template.txt, which I have added to my project through Resources.resx.

If I read the file normally like this:

string template = File.ReadAllText(@"filepath\Template.txt", Encoding.UTF8);

Everything works fine.

However if I read it like this:

string template = Properties.Resources.Template;

The string is filled with Japanese characters, so it has evidently been decoded with the wrong encoding. I also tried converting it back:

byte[] bytes = Encoding.Default.GetBytes(Properties.Resources.Template);
string template = Encoding.UTF8.GetString(bytes);

This still gives Japanese characters.

Does anyone know the cause? If I double-click the Template.txt file in Visual Studio, it also displays correctly.

rory.ap
Red Riding Hood
  • You probably did not specify UTF8 in the inclusion to Resources.resx, so it was garbled when it was turned into a resource. As a result, there is no way to get the resource back ungarbled. See how you can specify UTF8 in Resources.resx. – Mike Nakis Sep 30 '16 at 11:21
  • If the text appears wrong, it means that you didn't store it as Unicode. Just make sure the resource is actually stored as Unicode – Panagiotis Kanavos Sep 30 '16 at 11:22
  • When you embed a text file as a resource, the resource manager makes an effort to embed the file as text so that encoding plays no role. That's why you can get a string from the Template property. But as you can tell, it could not figure out that the text file contains UTF-8. So it guessed wrong and used the system default code page, which turns the text into gibberish if the file contains non-ASCII characters. Open the file with a text editor (even Notepad can do it) and save it back as UTF-8 so that a BOM is included. – Hans Passant Sep 30 '16 at 11:28
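
As a rough illustration of why re-encoding the garbled string afterwards may not help, here is a minimal sketch. It assumes ASCII as a stand-in for "the wrong decoder"; the decoder actually used when the resource was built may be different, and the class and method names are hypothetical. When the wrong decoder cannot represent every byte, the original bytes are lost and cannot be recovered by converting back.

```csharp
using System;
using System.Text;

// Hypothetical demo class; not part of the original question or answers.
static class LossyDecodeDemo
{
    // Encodes text as UTF-8, decodes it with the wrong encoding (ASCII here,
    // chosen only as an example), then tries to reverse the mistake.
    public static string RoundTrip(string original)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(original);

        // Wrong decoder: bytes above 0x7F have no ASCII meaning and are
        // replaced, so the information they carried is discarded here.
        string garbled = Encoding.ASCII.GetString(utf8);

        // Attempt to undo the damage, in the same way as the question does.
        byte[] back = Encoding.ASCII.GetBytes(garbled);
        return Encoding.UTF8.GetString(back);
    }
}
```

With this sketch, LossyDecodeDemo.RoundTrip("hello") returns the input unchanged because it is pure ASCII, while LossyDecodeDemo.RoundTrip("héllo") does not return the original string.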

3 Answers

As Hans Passant said in the comments, encoding the file so that it includes the UTF-8 BOM (Byte Order Mark) fixed the issue.
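
If you would rather add the BOM programmatically than through a text editor, something like the following sketch should work. It assumes the file already contains valid UTF-8; the class and method names are hypothetical, and the file has to be rewritten before the project is rebuilt so that the resource is regenerated from the updated file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; not part of the original answer.
static class Utf8BomHelper
{
    // The UTF-8 byte order mark is the byte sequence EF BB BF.
    public static bool HasUtf8Bom(byte[] bytes) =>
        bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;

    // Rewrites the file with a leading BOM if it does not have one yet.
    // Assumption: the existing content is valid UTF-8.
    public static void EnsureUtf8Bom(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        if (HasUtf8Bom(bytes))
        {
            return; // Nothing to do.
        }

        string text = new UTF8Encoding(false).GetString(bytes);

        // Passing true makes the encoding emit the BOM as its preamble,
        // which File.WriteAllText writes at the start of the file.
        File.WriteAllText(path, text, new UTF8Encoding(true));
    }
}
```

A call such as Utf8BomHelper.EnsureUtf8Bom(@"filepath\Template.txt") would then be run once, outside the application itself.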

Hans Passant
Red Riding Hood

It may be the encoding that is described inside the .resx file.

Open the .resx file in a text editor such as Notepad and you will notice entries like the following:

  <data name="file1" type="System.Resources.ResXFileRef, System.Windows.Forms">
    <value>..\Resources\file1.html;System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089;Windows-1252</value>
  </data>
  <data name="file2" type="System.Resources.ResXFileRef, System.Windows.Forms">
    <value>..\Resources\file2.html;System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089;utf-8</value>
  </data>

You can then change Windows-1252 to utf-8 and save the file.

dkokkinos

In Visual Studio, select the Resources.resx file, then go to File > Save Resources.resx As..., choose Save with Encoding..., select Unicode (UTF-8 without signature) - Codepage 65001, and click OK.

Joel Wiklund