1

Question:

I query a Quake 3 master server via UDP and receive the response with the code below. As you can see, I had trouble figuring out the encoding of what the server sent. Is there any way to detect or set the encoding of the received data?

        baBuffer = new byte[1024*100]; // 100 KB should be enough
        int recv = sctServerConnection.ReceiveFrom(baBuffer, ref tmpRemote);

        Console.WriteLine("Message received from {0}:", tmpRemote.ToString());

        System.Text.Encoding encResponseEncoding = System.Text.Encoding.Default; // Wrong...
        //encResponseEncoding = System.Text.Encoding.ASCII;
        //encResponseEncoding = System.Text.Encoding.UTF8;
        //encResponseEncoding = System.Text.Encoding.GetEncoding(437); // ANSI-DOS
        //encResponseEncoding = System.Text.Encoding.GetEncoding(1252);// ANSI-WestEurope
        //encResponseEncoding = System.Text.Encoding.GetEncoding(1250); // Ansi-Centraleuro
        //encResponseEncoding = System.Text.Encoding.GetEncoding("ISO-8859-1");
        //encResponseEncoding = System.Text.Encoding.GetEncoding("ISO-8859-9");
        //encResponseEncoding = System.Text.Encoding.UTF32;
        encResponseEncoding = System.Text.Encoding.UTF7; // Bingo !
soulmerge
Stefan Steiger

3 Answers

1

The encoding (if it is actually text) is determined by the protocol. If you don't have a protocol spec and you don't have the source code then, yes, you'll have to guess.
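Before guessing at an encoding, it can help to look at the raw bytes and compare them with whatever protocol documentation is available. The following is a minimal C# sketch of a hex dump, not part of the original code; `PacketDump` and `HexDump` are hypothetical names, and `buffer` and `count` stand for the question's `baBuffer` and `recv`.

```csharp
using System;
using System.Text;

static class PacketDump
{
    // Formats the first `count` bytes as rows of 16 hex values
    // followed by a printable-ASCII column.
    public static string HexDump(byte[] buffer, int count)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < count; offset += 16)
        {
            int rowLength = Math.Min(16, count - offset);
            sb.Append(offset.ToString("X4")).Append("  ");
            for (int i = 0; i < 16; i++)
            {
                // Pad a short final row so the text column stays aligned.
                sb.Append(i < rowLength ? buffer[offset + i].ToString("X2") + " " : "   ");
            }
            sb.Append(' ');
            for (int i = 0; i < rowLength; i++)
            {
                byte b = buffer[offset + i];
                // Show printable ASCII as-is and everything else as '.'.
                sb.Append(b >= 0x20 && b < 0x7F ? (char)b : '.');
            }
            sb.Append('\n');
        }
        return sb.ToString();
    }
}
```

Calling `Console.Write(PacketDump.HexDump(baBuffer, recv));` prints only the bytes that were actually received rather than the whole 100 KB buffer. Long runs of non-printable bytes suggest the payload is binary data rather than text in any encoding.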

Hans Passant
1

There is no way to reliably detect an encoding; you can only guess. See also "How can I detect the encoding/codepage of a text file".
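One guess that can at least be ruled out mechanically is UTF-8, because not every byte sequence is valid UTF-8. The sketch below, in C#, uses a strict `UTF8Encoding` that throws on invalid input; `EncodingGuess` and `LooksLikeUtf8` are hypothetical names. Single-byte encodings such as ISO-8859-1 accept any byte sequence, so they cannot be tested this way.

```csharp
using System;
using System.Text;

static class EncodingGuess
{
    // Returns true if the first `count` bytes decode as valid UTF-8.
    // Pure ASCII also passes, and a pass does not prove the sender used UTF-8.
    public static bool LooksLikeUtf8(byte[] buffer, int count)
    {
        // throwOnInvalidBytes: true makes decoding fail instead of
        // silently substituting U+FFFD for bad sequences.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strictUtf8.GetString(buffer, 0, count);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```

A `false` result is conclusive (the data is not UTF-8); a `true` result is only a hint.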

Community
svick
1

You can look for the Byte Order Mark (BOM). Here's some VB.Net code that I use:

Private Shared Function GetStringFromBytes(ByVal bytes() As Byte) As String
    Dim byteLength = bytes.Length
    ' Check the 4-byte UTF-32 BOMs first: the UTF-32 LE BOM (FF FE 00 00)
    ' begins with the UTF-16 LE BOM (FF FE) and would otherwise never be reached.
    ' Each branch skips the BOM so it does not end up in the returned string.
    If (byteLength >= 4) AndAlso (bytes(0) = &H0) AndAlso (bytes(1) = &H0) AndAlso (bytes(2) = &HFE) AndAlso (bytes(3) = &HFF) Then
        Return New System.Text.UTF32Encoding(True, True).GetString(bytes, 4, byteLength - 4)
    ElseIf (byteLength >= 4) AndAlso (bytes(0) = &HFF) AndAlso (bytes(1) = &HFE) AndAlso (bytes(2) = &H0) AndAlso (bytes(3) = &H0) Then
        Return System.Text.Encoding.UTF32.GetString(bytes, 4, byteLength - 4)
    ElseIf (byteLength >= 3) AndAlso (bytes(0) = &HEF) AndAlso (bytes(1) = &HBB) AndAlso (bytes(2) = &HBF) Then
        Return System.Text.Encoding.UTF8.GetString(bytes, 3, byteLength - 3)
    ElseIf (byteLength >= 2) AndAlso (bytes(0) = &HFE) AndAlso (bytes(1) = &HFF) Then
        Return System.Text.Encoding.BigEndianUnicode.GetString(bytes, 2, byteLength - 2)
    ElseIf (byteLength >= 2) AndAlso (bytes(0) = &HFF) AndAlso (bytes(1) = &HFE) Then
        Return System.Text.Encoding.Unicode.GetString(bytes, 2, byteLength - 2)
    Else
        'No BOM, assume ASCII
        Return System.Text.Encoding.ASCII.GetString(bytes)
    End If
End Function
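.NET's `StreamReader` can also do BOM detection on its own when constructed with `detectEncodingFromByteOrderMarks` set to true. The following is a minimal C# sketch of that alternative, not the code from this answer; `BomDecoder` and `Decode` are hypothetical names, and the fallback encoding is whatever the caller chooses to assume when no BOM is present.

```csharp
using System.IO;
using System.Text;

static class BomDecoder
{
    // Decodes `bytes`, letting StreamReader choose the encoding from a
    // leading BOM and using `fallback` when no BOM is found.
    public static string Decode(byte[] bytes, Encoding fallback)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, fallback, detectEncodingFromByteOrderMarks: true))
        {
            // The BOM itself is consumed and is not part of the result.
            return reader.ReadToEnd();
        }
    }
}
```

For example, `BomDecoder.Decode(data, Encoding.ASCII)` mirrors the "no BOM, assume ASCII" fallback above. Like any BOM check, this only helps if the sender actually emits a BOM.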
Chris Haas