
I am trying to decode the word Mickaël here:

using System;
using System.Net;
using System.Web;

public class Program
{
    public static void Main()
    {
        string s1 = WebUtility.HtmlDecode("Micka&#xEBl");
        string s2 = HttpUtility.HtmlDecode("Micka&#xEBl");

        Console.WriteLine(s1);
        Console.WriteLine(s2);
    }
}

But I get this output:

Micka&#xEBl

Micka&#xEBl

How can I decode this word properly?

Lamloumi Afif

2 Answers


Your input is incorrect. If you HtmlEncode Mickaël, you end up with Micka&#235;l.

string s1 = WebUtility.HtmlDecode("Micka&#235;l");
Console.WriteLine(s1);

Outputs Mickaël

Owen Pauling

TL;DR Your input is missing a semicolon. It should be Micka&#xEB;l.

If you check the specification, you can see that there are three ways to encode a character in HTML:

&#x20AC;    hexadecimal numeric character reference
&#8364;     decimal numeric character reference
&euro;      named character reference

All of them start with a & and end with a ;.

ë can be encoded either as a named character reference (&euml;), or using its numeric character reference 235, encoded as either decimal (&#235;) or hexadecimal (&#xEB;).

Your input uses the hexadecimal encoding, but misses the final ;. If you add it back in, your code works: https://dotnetfiddle.net/7PPLu4
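
To make the comparison concrete, here is a minimal sketch that runs all three correctly terminated forms through WebUtility.HtmlDecode, alongside the unterminated input from the question. The inputs array and the loop are only illustrative; the first three strings should decode to Mickaël, while the last one is expected to come back unchanged, as the question observed.

```csharp
using System;
using System.Net;

public class Program
{
    public static void Main()
    {
        // Three equivalent, correctly terminated references for ë (U+00EB),
        // followed by the unterminated input from the question.
        string[] inputs =
        {
            "Micka&#xEB;l", // hexadecimal numeric character reference
            "Micka&#235;l", // decimal numeric character reference
            "Micka&euml;l", // named character reference
            "Micka&#xEBl",  // missing the final ';'
        };

        foreach (string input in inputs)
        {
            // Print each input next to whatever the decoder returns for it.
            Console.WriteLine($"{input} -> {WebUtility.HtmlDecode(input)}");
        }
    }
}
```

HttpUtility.HtmlDecode can be swapped in for WebUtility.HtmlDecode in the loop if System.Web is available in your project.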

(@OwenPauling identified the problem with the input first. I was asked in the comments to post an answer expanding on the different ways to encode the ë.)

canton7