I have a simple C# Windows Forms app that converts rich text (pasted from a PDF or DOC file) into RTF markup. However, whenever I paste in Asian characters like this:
ありがとうございました
the generated RTF markup contains hex escape sequences instead of the characters themselves, like this:
\'82\'a0\'82\'e8\'82\'aa\'82\'c6\'82\'a4\'82\'b2\'82\'b4\'82\'a2\'82\'dc\'82\'b5\'82\'bd
Does anyone know whether it's possible to prevent this and have the RTF markup retain the actual Asian characters? I know it's theoretically possible, because if I paste the actual characters directly into the RTF markup window, they do not get converted.
Per request, here's the actual code that pushes the RTF text into a plain text field, which is the point where the Asian characters get converted:
private void rtfBox_TextChanged(object sender, EventArgs e)
{
    // Rtf is already a string, so no ToString() call is needed.
    plainTextBox.Text = rtfBox.Rtf;
}
Source of the app here if anyone wants to see further: https://github.com/cemerson/RTFMarkupHelper
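For what it's worth, the escapes above look like the Shift-JIS (code page 932) bytes of the pasted text, two bytes per character. Below is a minimal sketch of turning such `\'xx` runs back into characters, assuming that code page. The `RtfHexEscapes` class and its `Decode` method are hypothetical names made up for illustration, not part of the app or of any library, and the sketch does not parse RTF properly: it ignores the `\ansicpg`/`\fcharset` settings that really determine the code page and would misread an escaped backslash followed by `'xx`.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class RtfHexEscapes
{
    static RtfHexEscapes()
    {
        // Needed on .NET Core / .NET 5+ so that code page 932 is available.
        // On .NET Framework the code page encodings are built in, and this
        // line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Replaces each run of \'xx escapes with the text those bytes encode.
    // Assumption: the bytes are in the given code page (932 = Shift-JIS).
    public static string Decode(string rtf, int codePage = 932)
    {
        Encoding encoding = Encoding.GetEncoding(codePage);

        return Regex.Replace(rtf, @"(?:\\'[0-9a-fA-F]{2})+", match =>
        {
            // Every escape is exactly four characters long: \ ' h h
            byte[] bytes = new byte[match.Length / 4];
            for (int i = 0; i < bytes.Length; i++)
            {
                string hex = match.Value.Substring(i * 4 + 2, 2);
                bytes[i] = Convert.ToByte(hex, 16);
            }

            // Decode the whole run at once so multi-byte characters stay intact.
            return encoding.GetString(bytes);
        });
    }
}
```

Calling `RtfHexEscapes.Decode` on the escaped string above gives back ありがとうございました. This only post-processes the markup for display; it does not stop the control from emitting the escapes in the first place, which is still what I'm asking about.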
Related/Possible duplicates:
- Best way to decode hex sequence of unicode characters to string
- How to output unicode string to RTF (using C#)
- How to convert a string to RTF in C#?
- Output RTF special characters to Unicode
- Display of Asian characters (with Unicode): Difference in character spacing when presented in a RichEdit control compared with using ExtTextOut
- Converting Unicode strings to escaped ascii string
- How can I decode HTML characters in C#?
- How to encode and decode Broken Chinese/Unicode characters?