I have a function that decodes ASCII \uXXXX escape sequences into Russian characters, but I also need the reverse: converting Russian characters into ASCII escape sequences.

The function I have is:

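// Requires: using System.Globalization; using System.Text.RegularExpressions;
public string DecodeEncodedNonAsciiCharacters(string value)
{
    return Regex.Replace(
        value,
        @"\\u(?<Value>[a-fA-F0-9]{4})", // exactly four hex digits; other letters would make int.Parse throw
        m => ((char)int.Parse(m.Groups["Value"].Value, NumberStyles.HexNumber)).ToString());
}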

I can't find a good way to produce escapes such as \u235 in my text, or any other way to escape these kinds of characters.

user3763117
  • I suspect this is a duplicate of [Decode Unicode character](http://stackoverflow.com/questions/9303257/how-to-decode-a-unicode-character-in-a-string), but without a sample it is hard to say for sure. Would you mind showing a sample string and the expected output? – Alexei Levenkov Feb 09 '15 at 00:38
  • No, it's the other way around. I have Russian characters in an HTML string (I don't have a Russian keyboard on this machine) and want to convert them to ASCII. The problem is that as soon as I post the data to my WS, ???? shows up instead of the Russian text between the HTML tags. – user3763117 Feb 09 '15 at 00:40
  • The HTML looks like this: Я легко довольствуюсь самым лучшим. (Уинстон Черчилль) – user3763117 Feb 09 '15 at 01:41

1 Answer

Something like this? (Fiddle: https://dotnetfiddle.net/6BbXAt)

public static string EncodeNonAsciiCharacters(string value)
{
  return Regex.Replace(
    value,
    @"[^\x00-\x7F]",
    m => String.Format("\\u{0:X4}", (int)m.Value[0]));
}

The regex is from the question "(grep) Regex to match non-ASCII characters?".
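To check that the encoder and the decoder from the question really are inverses, here is a minimal round-trip sketch. The class name UnicodeEscapes and the short method names Encode and Decode are made up for this example; the bodies follow the two functions in this thread, with the decoder pattern narrowed to hex digits. It works per UTF-16 code unit, so a character outside the BMP comes out as two \uXXXX escapes.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for this example only.
public static class UnicodeEscapes
{
    // Replace every UTF-16 code unit outside the ASCII range with \uXXXX.
    public static string Encode(string value)
    {
        return Regex.Replace(
            value,
            @"[^\x00-\x7F]",
            m => string.Format("\\u{0:X4}", (int)m.Value[0]));
    }

    // Turn each \uXXXX escape (exactly four hex digits) back into its character.
    public static string Decode(string value)
    {
        return Regex.Replace(
            value,
            @"\\u(?<Value>[0-9a-fA-F]{4})",
            m => ((char)int.Parse(m.Groups["Value"].Value, NumberStyles.HexNumber)).ToString());
    }

    public static void Main()
    {
        string original = "<p>Я легко</p>";
        string encoded = Encode(original);

        Console.WriteLine(encoded);                      // <p>\u042F \u043B\u0435\u0433\u043A\u043E</p>
        Console.WriteLine(Decode(encoded) == original);  // True
    }
}
```

One caveat: if the input already contains a literal backslash-u sequence as plain text, Decode will convert that too, so the round trip is only exact for input without such sequences.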

gaiazov