12

Is there any function in VB.NET (or C#) that encodes a string in UCS-2?

Thanks

kassar

4 Answers

15

Use the following functions to encode a Unicode string in "UCS2" format (as a hex string) and to decode it again:

    // Requires: using System; using System.IO; using System.Text;

    // Encodes a message as a big-endian UCS2 hex string, as used for GSM messages.
    // Note: BigEndianUnicode is UTF-16BE, so characters outside the BMP
    // come out as surrogate pairs rather than strict UCS2.
    public static String UnicodeStr2HexStr(String strMessage)
    {
        byte[] ba = Encoding.BigEndianUnicode.GetBytes(strMessage);
        String strHex = BitConverter.ToString(ba);   // e.g. "06-33-06-44"
        strHex = strHex.Replace("-", "");
        return strHex;
    }

    // Decodes a UCS2 hex string back into a .NET string.
    public static String HexStr2UnicodeStr(String strHex)
    {
        byte[] ba = HexStr2HexBytes(strHex);
        return HexBytes2UnicodeStr(ba);
    }

    // Decodes the big-endian UCS2 bytes of a GSM message.
    public static String HexBytes2UnicodeStr(byte[] ba)
    {
        var strMessage = Encoding.BigEndianUnicode.GetString(ba, 0, ba.Length);
        return strMessage;
    }

    // Converts a hex string (spaces are ignored) into bytes, two hex digits per byte.
    // A trailing odd digit, if any, is dropped.
    public static byte[] HexStr2HexBytes(String strHex)
    {
        strHex = strHex.Replace(" ", "");
        int nNumberChars = strHex.Length / 2;
        byte[] aBytes = new byte[nNumberChars];
        using (var sr = new StringReader(strHex))
        {
            for (int i = 0; i < nNumberChars; i++)
                aBytes[i] = Convert.ToByte(new String(new char[2] { (char)sr.Read(), (char)sr.Read() }), 16);
        }
        return aBytes;
    }

For example:

    String strE = SmsEngine.UnicodeStr2HexStr("سلام به گچپژ پارسي");
    // strE = "0633064406270645002006280647002006AF0686067E06980020067E062706310633064A"
    String strD = SmsEngine.HexStr2UnicodeStr("0633064406270645002006280647002006AF0686067E06980020067E062706310633064A");
    // strD = "سلام به گچپژ پارسي"

Behzad Ebrahimi
7

No, .NET supports the full Unicode range for strings and many encodings that derive from System.Text.Encoding. You can trivially get UTF-16, but not UCS-2. However, if you first get rid of all surrogate pairs in the input string, then UTF-16 is UCS-2. But there's no built-in encoding that does that for you.
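
The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Ucs2Sketch` and `EncodeUcs2` are hypothetical names (not part of .NET), big-endian byte order is assumed, and rejecting surrogates with an exception is just one option (replacing or dropping them would be another).

```csharp
using System;
using System.Text;

public static class Ucs2Sketch
{
    // Hypothetical helper, not a .NET API: returns big-endian UCS-2 bytes,
    // or throws if the string contains surrogates (characters beyond U+FFFF).
    public static byte[] EncodeUcs2(string text)
    {
        foreach (char c in text)
        {
            if (char.IsSurrogate(c))
                throw new ArgumentException(
                    "Input contains characters outside the Basic Multilingual Plane.",
                    nameof(text));
        }

        // With no surrogates present, UTF-16 code units are exactly UCS-2 code units.
        return Encoding.BigEndianUnicode.GetBytes(text);
    }
}
```

With this sketch, `BitConverter.ToString(Ucs2Sketch.EncodeUcs2("Hi"))` gives `00-48-00-69`, while a string containing a Plane 1 character such as `"\U0001D538"` is rejected.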

Joey
  • How likely is it he'll ever come across any characters out of the first plane, though? Or fonts to display them? – Rup Aug 09 '10 at 08:32
  • Rup: Without knowing the context, that's hard to tell. Many of my Word documents have large amounts of Plane 1 characters (remember: the math Latin alphabets are there), although the other astral planes are indeed somewhat rare. In any case, if they specifically ask for UCS-2 I'd assume they know how it differs from UTF-16 and also know why they need UCS-2. In the end it's not for us to speculate whether or not something else would be right here :-) – Joey Aug 09 '10 at 08:38
1

See Encoding.Unicode.

Given a .NET String, call Encoding.Unicode.GetBytes to get a byte array representing that string encoded in UCS2.

Edit: In the context of System.Text.Encoding, Unicode = UTF-16. As Johannes points out, these are not the same thing in the presence of surrogates.
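
As a small illustration of the point above (a sketch only; `ByteOrderSketch` and its method names are made up for this example), `Encoding.Unicode` produces little-endian UTF-16 and `Encoding.BigEndianUnicode` produces big-endian UTF-16, and a character outside the BMP takes four bytes in either case:

```csharp
using System;
using System.Text;

public static class ByteOrderSketch
{
    // Encoding.Unicode is UTF-16 little-endian: low byte first.
    public static string LittleEndianHex(string text) =>
        BitConverter.ToString(Encoding.Unicode.GetBytes(text));

    // Encoding.BigEndianUnicode is UTF-16 big-endian: high byte first.
    public static string BigEndianHex(string text) =>
        BitConverter.ToString(Encoding.BigEndianUnicode.GetBytes(text));

    // Number of bytes the string occupies in UTF-16; a surrogate pair takes 4.
    public static int Utf16ByteCount(string text) =>
        Encoding.Unicode.GetByteCount(text);
}
```

For `"A"`, `LittleEndianHex` gives `41-00` and `BigEndianHex` gives `00-41`; `Utf16ByteCount("\U0001D538")` is 4, which is where UTF-16 and UCS-2 diverge.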

Tim Robinson
0

I think String.Normalize() will do what you want.

String.Normalize()

https://learn.microsoft.com/en-us/dotnet/api/system.string.normalize?view=netframework-4.8
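
For context, a minimal sketch of what String.Normalize() does, as far as the linked documentation describes it: it changes the Unicode normalization form of a string (by default Form C) and returns another string, so on its own it does not produce encoded bytes and would presumably still need an Encoding call afterwards. The class and method names here are made up for illustration.

```csharp
using System;
using System.Text;

public static class NormalizeSketch
{
    // Returns the number of UTF-16 code units before and after
    // normalizing the input to Form C (canonical composition).
    public static (int Before, int After) ComposeLengths(string text)
    {
        string composed = text.Normalize(NormalizationForm.FormC);
        return (text.Length, composed.Length);
    }
}
```

For example, `NormalizeSketch.ComposeLengths("e\u0301")` returns `(2, 1)`: the letter plus combining acute accent is composed into the single character U+00E9.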

zezba9000