Questions tagged [ucs2]

UCS-2 (2-byte Universal Character Set) is an early 16-bit encoding of Unicode that has been superseded by UTF-16.

UCS-2 is limited to 65,536 code points (U+0000 through U+FFFF) and produces a fixed-length format by simply using the code point as the 16-bit code unit.

In UCS-2 each character is represented by 16 bits or 2 bytes. (The number 2 in UCS-2 indicates 2 bytes.)

For example:

Uppercase A (U+0041) is represented by 0041. Because this encoding cannot represent code points above U+FFFF, it is no longer sufficient and has been superseded by the UTF-16 encoding.

UCS-2 was superseded by UTF-16 in version 2.0 of the Unicode standard in July 1996.
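
As a rough illustration of the fixed-width scheme described above, here is a minimal Python sketch. The helper name `ucs2_encode_be` is hypothetical (it is not a standard-library function), and big-endian byte order is an assumption made for the example. It writes each code point directly as one 16-bit unit and rejects anything above U+FFFF, which is the case UTF-16 covers with a surrogate pair.

```python
import struct


def ucs2_encode_be(text: str) -> bytes:
    """Hypothetical helper: encode text as big-endian UCS-2.

    Each code point becomes exactly one 16-bit code unit.
    Simplification: surrogate code points (U+D800..U+DFFF) are not filtered.
    """
    units = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # Outside the Basic Multilingual Plane: no UCS-2 representation.
            raise ValueError(f"U+{cp:04X} cannot be represented in UCS-2")
        units.append(cp)
    # ">" selects big-endian, "H" is an unsigned 16-bit integer.
    return struct.pack(f">{len(units)}H", *units)


print(ucs2_encode_be("A").hex())               # 0041
print(ucs2_encode_be("\u0110").hex())          # 0110 (the letter Đ)

# For contrast, UTF-16 encodes a code point above U+FFFF as two 16-bit units.
print("\U0001F600".encode("utf-16-be").hex())  # d83dde00
```

For text that stays within U+0000 through U+FFFF (and contains no surrogates), the bytes produced this way match what Python's built-in `utf-16-be` codec produces, which is why UTF-16 decoders can generally read legacy UCS-2 data.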

121 questions

7 answers

How to find out if Python is compiled with UCS-2 or UCS-4?

Just what the title says. $ ./configure --help | grep -i ucs --enable-unicode[=ucs[24]] Searching the official documentation, I found this: sys.maxunicode: An integer giving the largest supported code point for a Unicode character. The value…

python unicode ucs2

asked Sep 18 '09 at 19:06

Sridhar Ratnakumar

4 answers

What version of Unicode is supported by which .NET platform and on which version of Windows in regards to character classes?

Updated question ¹ With regards to character classes, comparison, sorting, normalization and collations, what Unicode version or versions are supported by which .NET platforms? Original question I remember somewhat vaguely having read that .NET…

c# .net utf-16 ucs2 astral-plane

asked Feb 06 '12 at 15:04

Abel

1 answer

Python 3: reading UCS-2 (BE) file

I can't seem to be able to decode UCS-2 BE files (legacy stuff) under Python 3.3, using the built-in open() function (stack trace shows UnicodeDecodeError and contains my readLine() method) - in fact, I wasn't able to find a flag for specifying this…

file python-3.x ucs2

asked Jan 23 '13 at 20:02

elder elder

3 answers

How to convert a Unicode text-block to UTF-8 (HEX) code point?

I have a Unicode text-block, like this: ụ ư ứ Ỳ Ỷ Ỵ Đ Now, I want to convert this original Unicode text-block into a text-block of UTF-8 (HEX) code point (see the Hexadecimal UTF-8 column, on this page: https://en.wikipedia.org/wiki/UTF-8), by PHP;…

php regex encoding utf-8 ucs2

asked Jul 19 '15 at 13:27

user5132285

1 answer

What is the maximum number of characters in an USSD message?

I've understood that a USSD message consists of 160 bytes. For 7-bit data coding schemes, the maximum number of characters is 160*8/7, which gives 182 characters. It's unclear to me what the maximum number of characters is for UCS2 encoding.…

gsm ussd ucs2

asked Feb 02 '12 at 09:34

Victor Ionescu

1 answer

python base64 string decoding

I've got what's supposed to be a UCS-2 encoded xml document that I've managed to build a DOM based on minidom after some tweaking. The issue is that I'm supposed to have some data encoded on base64. I know for a fact that: AME= (or…

python encoding ucs2

asked Aug 03 '11 at 07:42

bleeding edge

8 answers

C++ strings: UTF-8 or 16-bit encoding?

I'm still trying to decide whether my (home) project should use UTF-8 strings (implemented in terms of std::string with additional UTF-8-specific functions when necessary) or some 16-bit string (implemented as std::wstring). The project is a…

c++ encoding utf-8 stdstring ucs2

asked Sep 19 '08 at 16:15

Carl Seleborg

3 answers

'UCS-2' codec can't encode characters in position 1050-1050

When I run my Python code, I get the following errors: File "E:\python343\crawler.py", line 31, in print (x1) File "E:\python343\lib\idlelib\PyShell.py", line 1347, in write return self.shell.write(s,…

python unicode encoding ucs2

asked Sep 07 '15 at 16:10

Andi

3 answers

best way to detect number of SMS needed to send a text

I'm looking for code or a library in PHP that I can pass a text to and that will tell me: which encoding I need to use in order to send this text as an SMS (7, 8, or 16 bit), and how many SMS messages I will need to send this text (it must be smart to…

php encoding sms ascii ucs2

asked Dec 01 '11 at 23:25

AFT

2 answers

What are the consequences of storing a C# string (UTF-16) in a SQL Server nvarchar (UCS-2) column?

It seems that SQL Server uses Unicode UCS-2, a 2-byte fixed-length character encoding, for nchar/nvarchar fields. Meanwhile, C# uses Unicode UTF-16 encoding for its strings (note: Some people don't consider UCS-2 to be Unicode, but it encodes all…

sql-server character-encoding utf-16 ucs2 codepoint

asked Apr 13 '11 at 20:36

Triynko

1 answer

R: can't read unicode text files even when specifying the encoding

I'm using R 3.1.1 on Windows 7 32bits. I'm having a lot of problems reading some text files on which I want to perform textual analysis. According to Notepad++, the files are encoded with "UCS-2 Little Endian". (grepWin, a tool whose name says it…

windows r unicode encoding ucs2

asked Oct 10 '14 at 18:34

s_a

1 answer

Change encoding (collation?) of SQL Server 2008 R2 to UTF-8

We'd like to move our Confluence system to a SQL Server 2008 R2. Now, since Confluence uses UTF-8 encoding, I'd need a database using the same encoding (I guess that's the collation?). There's the command alter database confluence set collation…

utf-8 sql-server-2008-r2 collation ucs2

asked Nov 23 '12 at 12:12

Ahatius

3 answers

Storing UTF-16/Unicode data in SQL Server

According to this, SQL Server 2K5 uses UCS-2 internally. It can store UTF-16 data in UCS-2 (with appropriate data types, nchar etc), however if there is a supplementary character this is stored as 2 UCS-2 characters. This brings the obvious issues…

sql-server unicode utf-16 ucs2

asked Apr 30 '09 at 03:39

David Cameron

2 answers

UCS-2 and SQL Server

While researching options for storing mostly-English-but-sometimes-not data in a SQL Server database that can potentially be quite large, I'm leaning toward storing most string data as UTF-8 encoded. However, Microsoft chose UCS-2 for reasons that I…

sql-server unicode utf-8 utf-16 ucs2

asked Jan 25 '12 at 18:22

Eric J.

4 answers

2-byte (UCS-2) wide strings under GCC

When porting my Visual C++ project to GCC, I found out that the wchar_t datatype is 4-byte UTF-32 by default. I could override that with a compiler option, but then the whole wcs* (wcslen, wcscmp, etc.) part of RTL is rendered unusable, since it…

c++ gcc right-to-left widestring ucs2

asked May 07 '10 at 17:28

Seva Alekseyev
