2

I can't figure out how to convert ISO-8859-1 characters, such as é, to their numeric character entities, such as &#233;.

I want to be able to take a string, such as: "Steel Décor"

and have it converted to: "Steel D&#233;cor"

famousgarkin
tracstarr

3 Answers

3

Assuming you don't care about HTML-encoding characters that are special in HTML (e.g., <, &, etc.), a simple loop over the string will work:

string input = "Steel Décor";
StringBuilder output = new StringBuilder();
foreach (char ch in input)
{
    if (ch > 0x7F)
        output.AppendFormat("&#{0};", (int) ch);
    else
        output.Append(ch);
}
// output.ToString() == "Steel D&#233;cor"

The if statement may need to be changed to also escape characters < 0x20, or non-alphanumeric, etc., depending on your exact needs.
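As a rough sketch of that adjustment, the loop can be wrapped in a helper that optionally escapes HTML markup characters and control characters as well. The class name `EntityEncoder`, the method name `ToNumericEntities`, and the `escapeMarkup` flag are hypothetical names chosen for illustration, and which characters count as "special" is an assumption you should adapt to your needs:

```csharp
using System;
using System.Text;

static class EntityEncoder
{
    // Hypothetical helper: replaces selected characters with decimal
    // numeric character references such as &#233;.
    public static string ToNumericEntities(string input, bool escapeMarkup)
    {
        var output = new StringBuilder(input.Length);
        foreach (char ch in input)
        {
            // Anything outside 7-bit ASCII is always escaped.
            bool nonAscii = ch > 0x7F;
            // Control characters below 0x20, except tab, LF and CR (assumed policy).
            bool control = ch < 0x20 && ch != '\t' && ch != '\n' && ch != '\r';
            // Characters with special meaning in HTML, escaped only on request.
            bool markup = escapeMarkup &&
                          (ch == '<' || ch == '>' || ch == '&' || ch == '"' || ch == '\'');

            if (nonAscii || control || markup)
                output.Append("&#").Append((int)ch).Append(';');
            else
                output.Append(ch);
        }
        return output.ToString();
    }
}
```

With `escapeMarkup` set to `false` this behaves like the loop above. Note that setting it to `true` also escapes the `&` of any entity already present in the input, so it should only be applied to text that has not been encoded yet.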

Bradley Grainger
1

HttpUtility.HtmlEncode does that. It resides in System.Web.dll, though, so it won't work with the .NET 4 Client Profile, for example.

liggett78
  • 1
    It does and it doesn't. It encodes the string, but not in the text format I'm looking for. It was the first thing I tried. I'm also not working with web stuff. – tracstarr Nov 25 '10 at 18:37
1

Using LINQ:

string toDec(string input)
{
    Dictionary<string, char> resDec =
        (from p in input.ToCharArray() where p > 127 select p).Distinct().ToDictionary(
            // Decimal entity: "&#" with no "x" prefix ("&#x" would mark a hexadecimal value).
            p => String.Format(@"&#{0:D};", (ushort)p));

    foreach (KeyValuePair<string, char> pair in resDec)
        input = input.Replace(pair.Value.ToString(), pair.Key);
    return input;
}
Kendo